$\newcommand{\V}[1]{\mathbf #1}$ $\newcommand{\M}[1] {#1}$ $\newcommand{\Reals} { \mathbb{R} }$

# Chapter 10. Motion Planning in Higher Dimensions¶

The algorithms we have studied for the 2D case do not apply to many other systems of interest: quadcopters that avoid obstacles in 3D, articulated robot arms, multi-robot teams, and robots that manipulate objects. The main issue that we are faced with is that previous geometric planning methods require the ability to explicitly perform computations about the shape of obstacles. In higher-dimensional configuration spaces, this is rather challenging. This chapter will consider how to address problems with configuration spaces of arbitrary dimension. (In fact, these algorithms can also be applied to problems of lower dimension too --- in low-dimensional spaces, planning is fairly easy!)

Geometric planning methods like visibility graphs and cell decomposition can often be extended to 3D environments with polyhedral obstacles with some work. There are also algorithms for path planning with constraints expressed as semi-algebraic sets that work in spaces of arbitrarily high dimension, but their running time is exponential in dimensionality, and are more of an intellectual curiosity since they have never been practically implemented. Grid search planners can also be extended to higher dimensional grids, but they must in the worst case explore an exponential number of grid vertices.

In fact, it has been proven that even feasible path planning is NP-hard in the case of articulated $n$-joint robots. Surprisingly, optimal path planning in the presence of 3D polygonal obstacles is also NP-hard in the number of obstacle vertices! This dramatic increase in the complexity of exact algorithms in dimensions 3 and higher has led to the development of several approximate methods, which are discussed in detail in this chapter.

## Implicit C-obstacle representation¶

As before, the forbidden region is composed of a union of C-obstacles as described abpve. C-obstacles for general articulated robots are in general, even more complex than they were in 2D and 3D. As a result, most motion planning algorithms outside of planar environments do not attempt to build an explicit representation of C-obstacles, but instead opt to perform Boolean feasibility queries in which the collision status of a robot is queried for a given configuration: in other words, we can test whether $q\in \mathcal{C}O$ for a specific configuration $q$, but we do not have a representation of any points on $\partial \mathcal{C}O$. Specifically, the user of a planner defines a subroutine as follows: $$Feasible(q) = \left\{ \begin{array}{ll} T & \text{ if q is in the free space} \\ F & \text{ if q is in the forbidden region} \end{array} \right.$$ A planner can then call this subroutine to probe whether a configuration is feasible. Since this will be called thousands or millions of times, fast planning in high-dimensional spaces requires efficient collision tests as described in Chapter 7 .

Often, we will also need to check whether a motion is feasible as well, usually a short segment of a path $\overline{pq}$ between configurations $p,q \in \mathcal{C}$. The process is called a visibility query in the case of a straight line path, and can be a user-defined subroutine or performed by the planner. The is specified as follows: $$Visible(p,q) = \left\{ \begin{array}{ll} T & \text{ if \overline{pq} is completely inside the free space} \\ F & \text{ if \overline{pq} intersects the forbidden region} \end{array} \right.$$

In general the process of checking motions for collision is known as dynamic collision checking. The simplest method for doing so is simply to take small steps along the path and perform feasibility queries at each configuration. More details about this and other techniques are described in the section below.

In addition to the Boolean feasibility query computational model, we also consider some planners that exploit knowledge encoded in an implicit function model $\mathcal{C}O = \{ q \quad |\quad f(q) \leq 0 \}$. For example, one such implicit function $f$ may be the signed distance in workspace between the robot and obstacles. (Specifically, this would return the distance when there is no collision, and the negative of the penetration depth when collision exists). For most complex geometric models it is far more computationally expensive to perform distance and penetration depth computations than collision queries. As a result there is a trade off between using a computational query that provides richer information vs the added complexity of invoking the query.

## Grid Search and the Curse of Dimensionality¶

Let us begin by considering the case of extending grid search to $n$-D space. It is fairly straightforward to build such a grid, and collision checking for arbitrary $n$-D robots at configurations or paths can be performed relatively quickly (we shall describe methods for doing so below). However, the number of vertices that may need to be explored grows exponentially with the dimension of the space. This growth rapidly overwhelms the available computational resources, both in time and memory.

It is helpful to get a sense of the absolute scale of exponential increase to appreciate how difficult this makes the problem. Consider creating a grid in a 6-D unit hypercube $[0,1]^6$ with resolution $h$ on each axis. The number of vertices in the grid is listed in the right of the below table. Clearly, at high resolutions it would be impractical to search the entire grid.

Resolution $h$ # vertices
0.5 64
0.25 46,656
0.1 1,000,000
0.05 64,000,000
0.025 46,656,000,000
0.01 1,000,000,000,000

Let us also fix a relatively manageable resolution, say 0.1, and observe what happens as dimension varies. The following table shows how many vertices are in a grid of variable dimension $[0,1]^d$.

Dimension $d$ # vertices
2 100
3 1,000
6 1,000,000
8 100,000,000
10 10,000,000,000
15 1,000,000,000,000,000
20 100,000,000,000,000,000,000

Yikes! Even if feasibility checking and visibility checking were super-fast, this becomes impractical for use in dimensions of around 8. This problem is generally known as the curse of dimensionality.

Besides the combinatorial explosion in the number of grid cells needed to span a space, there are several other odd effects in high dimensional spaces that are counterintuitive to our experience in 2D and 3D spaces. Examples include the fact that the volume of a hypersphere drops dramatically as dimension increases. In fact the volume of a unit hypersphere approaches 0 as $d\rightarrow \infty$! This implies that almost all points are far in a high dimensional space, for most reasonable definitions of "far." Another effect is that the complexity of a polytope grows dramatically. Consider a polytope in $d$-D space with $n$ faces, such as that defined by a linear equality $A x \leq b$, with $A$ an $n \times d$ matrix and $b$ an $n$-vector. The number of vertices of the polytope is $O( {n \choose d } )$, which grows exponentially in $n$ and $d$. As a result, the complexity of the free space can be exponential even for simple constraint representations.

Since this "curse" appears so often in computational problems, it is of great interest (and often surprising) to find algorithms that circumvent these limitations. However, they tend to trade-off computational complexity for other limitations. One example is Monte-Carlo integration, in which a sum of function values of randomly sampled points are used to estimate the integral of a function: $$\int_a^b f(x) dx \approx \frac{b-a}{N} \sum_{i=1}^N f(x_i) \label{eq:MonteCarlo1D}$$ where the points $x_1,\ldots,x_N$ are sampled uniformly at random from the range $[a,b]$. The approximation error of this estimate, assuming $f$ is well-behaved, is on the order of $O((b-a)/\sqrt{N})$.

Monte-Carlo integration can be generalized to higher dimensional functions. If $B$ is an axis-aligned, $n$-dimensional box $[a_1,b_1]\times\cdots\times[a_n,b_n]$, then the integral over $B$ can be approximated $$\int_B f(x) dx \approx \frac{|B|}{N} \sum_{i=1}^N f(x_i) \label{eq:MonteCarloND}$$ with $|B|=\prod_{i=1}^n (b_i-a_i)$ the volume of $B$ and $x_1,\ldots,x_n$ is a uniform sampling over $B$. Specifically, a uniform sampling random variables $\epsilon_i \in [0,1]$ uniformly at random, and sets $$x_i = a_i + \epsilon_i (b_i - a_i).$$ The approximation error of ($\ref{eq:MonteCarloND}$) is very similar to the 1D case: $O(|B|/\sqrt{N})$. Observe that the dimensionality $n$ did not appear at all in this equation!

The sampling-based planning methods introduced in this chapter are somewhat inspired by Monte-Carlo integration. They are not directly affected by dimensionality, but trade off these computational benefits with a probabilistic chance of failure.

## Sampling-based motion planning¶

The most popular set of techniques for motion planning on robots with 5 or more DOFs is the class of sampling-based motion planners, most notably the probabilistic roadmap (PRMs) and rapidly-exploring random tree (RRTs) planners. All such techniques are roadmap methods that build a network of paths in C-space, but they use different strategies for doing so.

There are three general reasons for their popularity:

1. The same algorithm can be generalized to new problems of arbitrary dimensionality simply with changes of the $Feasible(q)$ and $Visible(p,q)$ subroutines.

2. They often produce paths quickly for high-dimensional problems that are not too-maze like, and given enough time can eventually solve problems of arbitrary complexity (probabilistic completeness).

3. They can be implemented fairly quickly by a student competent in algorithms and graph data structures.

By "high-dimensional" we mean that sampling-based planners can routinely solve problems in spaces of approximately 10D, and with tuning (or luck) can also find feasible paths in dozens or even hundreds of dimensions.

Probabilistic roadmaps (PRM) are an approximate roadmap of the robot's free space built by randomly sampling free connections and attempting connections between them. The roadmap can then be searched as usual for a path from the start to the goal. The basic algorithm for constructing a PRM is as follows:

1. Sample $N$ configurations at random from the C-space (Figure 1.a--b).

2. Add all feasible configurations and the start and goal to the roadmap. These are known as milestones. (Figure 1.c)

3. Test pairs of nearby milestones for visibility, and add visible edges to the roadmap. (Figure 1.d--e)

4. Search for a path from the start to the goal. (Figure 1.f)

Figure 1. Illustrating the main steps of the PRM algorithm.

It can be shown that the method is probabilistically complete, in that if it finds a path, the path will be feasible. If it does not find a path, then this answer might be incorrect. However, the chance of this incorrect answer decreases to 0 as $N$ increases, given some relatively mild assumptions on the shape of the free space. Another useful property is that the likelihood of success is not directly dependent on dimensionality, but rather then visibility properties of the free space. As a result, it can reliably solve high-dimensional problems that have good visibility properties, and perform poorly in low-dimensional problems that have poor visibility properties (such as narrow passages).

#### Algorithm and key parameters¶

The basic PRM algorithm is given in Algorithm Basic-PRM. The first for loop adds any sampled feasible configurations to the roadmap $(V,E)$. The second for loop checks for pairs of "nearby" milestones, and adds edges as long as the path between them is collision-free.

Algorithm Basic-PRM(s,g,N)

1. $V \gets \{ s, g \}$.
2. $E \gets \{ \}$.
3. for $i=1,\ldots,N$ do
4.    $q \gets Sample()$
5.    if not Feasible($q$) then return to Line 3.
6.    Add $q$ to $V$
7.    for all $p \in near (q,V)$
8.        if Visible($p,q$) then
9.            Add $(p,q)$ to $E$.
10. Search $G = (V,E)$, with Cartesian distance as the edge cost, to connect $s$ and $g$.
11. return the path if one is found.

There are two key subroutines to tune, which greatly affect running time and performance:

• The sampling distribution over $\mathcal{C}$ (the $Sample()$ subroutine in Line 4),

• The method for determining nearby milestones in Line 7.

Since at most $N$ milestones will be added to the roadmap, it is important to place as many as possible inside $\mathcal{F}$ at critical locations that aid the planner in connecting the start and goal. The $Sample()$ subroutine can be tuned to do so. First, in order to have a nonnegligible chance of sampling a milestone in a useful location, PRMs require that the C-space is bounded. In the most basic uniform distribution it is assumed that $\mathcal{C}$ is a box $[a_1,b_1]\times \cdots \times [a_n,b_n]$, and $Sample()$ samples configurations uniformly across the box. However, there are other methods for improving performance, which we shall discuss later.

If we were to check all edges in the roadmap, this would lead to a total of $O(N^2)$ visibility checks. This can be rather computationally expensive for large $N$, and long paths are much more likely to collide than short ones. Hence the idea of restricting visibility tests only to "nearby" points is a fast way of determining a small set of potential edges that have a better chance of being feasible. To do this, we need first to determine a distance metric $$d(p,q)$$ that measures some notion of path length. The simplest form might measure Euclidean distance, but for configuration spaces that blend translation and rotation it is often more suitable to use a weighted geodesic metric.

Once a metric is defined, most frequently one of two methods are employed to calculate the set of milestones "near" $q$:

• The $k$-nearest neighbors of $q$ in $V$.

• The $R$-neighborhood: all milestones in $V$ within some radius $R$ of $q$.

Using fast nearest neighbors data structures, which will be described later, the $k$-nearest neighbors can be computed in $O(k + \log N)$ time, and the $R$-neighborhood can be computed in $O(h + \log N)$ time, where $h$ is the number of points returned. In any case, this usually saves a huge amount of time because $k$, $h$, and $\log N$ are much smaller than $N$, and distance queries are fast. If we are careful not to double-check the reverse of an edge that has been checked before, at most $kN/2$ (or $hN/2$) edges will be checked in the roadmap.

#### Incremental variant¶

As written, the basic PRM algorithm places all of the (up to $N$) samples and edges into the roadmap first before performing search. But in easy problems, perhaps fewer samples were needed to construct a roadmap that contained a solution -- say, the first $M << N$. If all we wanted was a feasible path from $s$ to $g$, it would be foolish to continue adding unnecessary points! Fortunately, it is straightforward to implement an incremental variant of PRM with very little additional cost. (This is in fact the typical variant used for PRM planning queries.)

Figure 3. An incremental version of PRM adds more samples to the roadmap while maintaining the roadmap's connected components. It terminates once the start and goal are in the same component.

Algorithm Incremental-PRM gives an implementation of Incremental-PRM using a special data structure to determine each of the connected components of the graph. A connected component consists of all milestones that are mutually connected by any path.

Algorithm Incremental-PRM(s,g,T)

1. $V \gets \{ s, g \}$.
2. $E \gets \{ \}$.
3. $CC[s] \gets \{s\}$.
4. $CC[g] \gets \{g\}$.
5. while time-elapsed $< T$ do
6.    $q \gets Sample()$
7.    if not Feasible($q$) return to Line 3.
8.    Add $q$ to $V$
9.    $CC[q] \gets \{q\}$
10.    for all $p \in near (q,V)$
11.        if Visible($p,q$) then
12.            Add $(p,q)$ to $E$.
13.            Merge $CC[p]$ and $CC[q]$
14.            if $g \in CC[s]$
15.                return the shortest path in $G = (V,E)$ between $s$ and $g$.
16. return "no path"

Every time an isolated milestone gets added to the graph (Lines 3, 4, and 9), it gets assigned a connected component in the data structure $CC$. The connected components are maintained as more edges are added to the roadmap (Figure 3). Once $s$ and $g$ are in the same connected component (Line 14), the algorithm stops. In Line 5, the main loop is stopped by a time limit $T$ rather than an iteration limit. This means the overall running time can be controlled more precisely, which is useful if the robot needs to generate a path within a certain deadline.

To give more details about the $CC$ data structure, it can be thought of as a map from each milestone to its connected component: that is, $CC[v]$ set of milestones $w \in V$ that can be reached from $v$ via a path in the $G=(V,E)$. After each change of $G$, $CC$ is updated to reflect any changes in connected component. The difficulty is that each time an edge gets added, the connected components of those two points need to be merged (Line 13). If this were done in a naïve fashion (say, by storing a list of connected milestones per milestone), it could take an extremely long time ($O(|V|)$, where $|V|$ is the number of vertices currently in $V$) for each update. Fortunately, there is a special disjoint set (aka union-find) data structure that is very fast at maintaining these sets. With this data structure, it has been proven that a merge takes O($\alpha(|V|)$) time on average, where $\alpha(n)$ is a very, very slow growing function of $n$ called the inverse Ackermann function. It grows so slowly, in fact, that for all practical values of $n$ it is less than 5, and hence this can be considered a constant. Overall, to perform $|E|$ edge insertions the overhead introduced by this data structure is $O(|E| \alpha(|V|))$, which is essentially $O(|E|)$, and hence no additional asymptotic running time penalty is incurred by the incremental algorithm.

#### Empirical performance¶

The performance of PRMs in practice can be quite variable from problem to problem, and even run to run (depending on the initial seed of a random number generator). In typical problems, the probability of successfully finding a path increases as $N$ increases. After a "burn in" phase with low $N$ where there is no change of finding a solution, the probability of success rises fairly sharply before tapering off. The slope of this tapering off depends on the visibility characteristics of the problem, which we shall discuss below.

The average time needed for incremental PRM to solve a given problem depends on a number of factors. Firstly, efficient collision queries are essential, since thousands or millions of queries will be made during a "reasonably" sized run $(N=$1,000 to 100,000, usually). Second, the nearness criterion (either number of neighbors $k$ or radius $R$) should be set so that a small number of edges are checked, but not too small.

The path quality of PRM solutions can be quite variable. If there exists a suboptimal wide passage in $\mathcal{F}$ while the optimal passage is narrow, incremental PRM will find the suboptimal one first (with very high likelihood). Basic PRM will find the suboptimal one when $N$ is low, but does have a chance to find the optimal one once $N$ is large enough. There is also an issue of jerkiness of the produced path, which tends to be more pronounced with incremental PRM is used, or fewer neighbors are connected. We shall see strategies to address the jerkiness of PRM paths when discussion optimizing roadmaps and shortcutting.

### PRM variants¶

#### Rapidly-Exploring Random Trees (RRTs)¶

One of the most popular PRM variants is the Rapidly-Exploring Random Tree (RRT) algorithm, which grows a tree rather than a graph. Originally developed for kinodynamic planning, it is easily adapted to kinematic planning as well. The specific variant we will discuss is called RRT-Connect, which is a bidirectional search.

RRT-Connect grows two trees of feasible paths, one rooted at the start and the other at the goal. At each iteration, both the start and the goal trees are extended toward a randomly sampled configuration. Then, if the trees are close enough, a connection will be attempted between them. If connected, the joined trees contain a unique path from the start to the goal.

Algorithm RRT-Connect

1. $T_s \gets \{ s \}$.
2. $T_g \gets \{ g \}$.
3. for $i=1,...,N$ do
4.    $q_{rand} \gets Sample()$
5.    $q_e \gets$Extend-Tree$(T_s,q_{rand},\delta)$
6.    $q_e^\prime \gets$Extend-Tree$(T_g,q_{rand},\delta)$
7.    if $d(q_e,q_e^\prime) \leq \delta$ and Visible($q_e,q_e^\prime$)then
8.        Add edge $q_e\rightarrow q_e^\prime$ to connect $T_s$ and $T_g$
9.        return the path from $s$ to $g$
10. return "no path"

Algorithm Extend-Tree

1. $q_{near} \gets Nearest(T,q_{rand})$
2. $q \gets q_{near} + \min{1,\frac{\delta}{d(q_{rand},q_{near})}}(q_{rand}-q_{near})$
3. if Visible$(q_{near},q)$ then
4.    Add edge $q_{near}\rightarrow q$ to $T$.
5.    return$q$.
6. return $q_{near}$.

Specifically, the pseudocode is listed in Alg. RRT-Connect. $T_s$ and $T_g$ denote the trees rooted at the start and goal, respectively. In Line 3, a random configuration is drawn, and in Lines 4--5, the trees are extended toward it along a straight line path using the Extend-Tree subroutine. RRT has a key parameter $\delta$, which a limit to how far a tree can be extended on each step. In other words, every edge in each tree has length no more than $\delta$. Also, if the two extended milestones are within distance $\delta$, they are connected. For small values of $\delta$, it is more likely for each extension to succeed, but the tree makes slower progress in exploring the free space.

Code for $Extend-Tree$ is given in Alg. Extend-Tree. It first performs a nearest-neighbor query on the milestones in the given tree to determine a milestone $q_{near}$. It then extends a short path no more than distance $\delta$ toward the destination $q_{rand}$. If this edge is visible, then it is added to the tree.

Unlike PRMs, RRTs do not use the configurations coming from the $Sample()$ function directly, nor do they attempt more than one edge connection per iteration. Hence, they sample points in a different distribution than PRMs. But what is this distribution? We first introduce the concept of a Voronoï diagram, which defined for some set of points $X = \{\V{x}_1,\ldots,\V{x}_n\}$. The Voronoï diagram is a partition of the space $\mathcal{C}$ into Voronoï cells , one per point. The cell $C_i$ corresponding to a point $\V{x}_i$ is the subset of $\mathcal{C}$ for which $\V{x}_i$ is the closest point. In other words, $$C_i \equiv \{ \V{x} \in \mathcal{C}\quad | \quad i = \arg \min_{i=1,\ldots,n} d(\V{x},\V{x}_i) \}$$ RRT is said to employ a Voronoï bias strategy because each milestone in a tree is selected for expansion (i.e., be the nearest node to $q_{rand}$) with probability proportional to the volume of its Voronoï cell. This means that milestones that are closer to unexplored areas of $\mathcal{C}$ have a higher likelihood of being expanded. Moreover, the extended milestone will have a higher likelihood of extending the tree in unexplored directions (and hence the term rapidly exploring applies here).

RRTs are appealing because tree data structures are a bit simpler to implement than graphs. Also, the RRT explores locally first, so if the start and goal are nearby, the RRT may do significantly less work than a PRM. However, RRT performance is generally more sensitive to the choice of a distance metric, and is generally better at exploring than refining.

Figure 3 To escape the mouth of a bugtrap, an RRT needs to sample a very carefully chosen sequence of milestones within the general area that it has already explored. But due to the Voronoï bias, it frequently attempts infeasible extensions from the highlighted frontier nodes.

As an example, the "bugtrap" problem illustrated in Figure 3 tends to pose challenges for the RRT. In this and many other problems, a planner needs to strike a balance between exploration toward new areas and refinement of the roadmap in existing areas. Let's assume RRT only grows a tree from the start; it is easy to imagine double-bugtraps that cause the same behavior for the goal. Here, the has a very difficult time wiggling out of the opening of the trap because it appears purely from the Voronoï bias that the frontier has not yet been adequately explored. However, each attempted extension ends up bumping into the walls of the trap. A sequence of precisely-chosen values of $q_{rand}$ are needed to escape the trap, which is highly unlikely to occur by chance. In general it can be challenging to say whether an RRT or PRM will work better for a given problem.

#### Nonuniform sampling strategies¶

Since PRM and RRT performance depends highly on how well samples are placed in critical regions, several strategies have been developed to boost performance with nonuniform sampling. PRMs benefit from placing more samples in low-visibility regions, which requires identifying areas that are relatively constrained or close to obstacles. One way to do this is to record how many feasible and infeasible edges were attempted for each milestone (these are stored as counts $n_f[q]$ and $n_i[q]$, respectively, for each $q\in V$). After $N$ samples, more samples are added near the milestones with a large fraction of infeasible edges, with the hope that these milestones are located in low-visibility regions where a denser sampling is needed to make connections. Specifically, we might pick a milestone $q \in V$ with probability proportional to $n_i[q] / (n_i[q] + n_f[q])$ and then sample a new configuration from a disk centered at $q$ with radius $R$. If feasible, the new milestone is connected to the roadmap as usual.

Another method that can boost PRM performance in low-visibility spaces is the Gaussian sampling strategy. The idea is to increase the density of milestones near the boundaries of obstacles, since low-visibility regions will certainly be located near obstacles. The method actually draws two samples: one $q_1$ at random, and the second $q_2$ from a multivariate Gaussian distribution (see Appendix A.3.) centered at $q_1$ and with standard deviation $\sigma$. Then, only if exactly one of the samples is feasible, that sample is kept. Otherwise, both are thrown out. This ensures that the segment between $q_1$ and $q_2$ straddles the boundary between the free space and forbidden region.

It might seem odd to throw away perfectly good feasible samples, since adding them to the roadmap won't hurt (and can only help) connectivity. However, every additional milestone incurs additional work to test and connect edges. In fact, edge collision checking is often the dominant cost of planning. It turns out that in the presence of narrow passages, the added cost to generate samples is worth it, and Gaussian sampling can perform quite well. However, for best performance the perturbation standard deviation $\sigma$ must be tuned to trade off these competing costs.

RRTs benefit from a slight goal bias which drives the tree toward the goal. In RRT-Connect, this could take the form of sampling $q_{rand}$ from $T_{g}$ some fraction of the time, which would drive extensions of $T_{s}$ toward the goal. Similarly, the reverse search could sample $q_{rand}$ from $T_s$ some fraction of the time, and drive extensions from $T_g$ toward the start. This takes the form of replacing Lines 4--6 in Algorithm RRT-Connect with the following code:

1. if {$rand() \leq p_{gb}$}
2.    if {$rand() \leq 0.5$}
3.        $q_{e}^\prime \gets RandomVertex(T_g)$
4.        $q_{e} \gets$Extend-Tree$(T_s,q_{e}^\prime,\delta)$
5.    else
6.        $q_{e} \gets RandomVertex(T_s)$
7.        $q_{e} \gets$Extend-Tree$(T_g,q_{e},\delta)$
8. Perform Lines 4--6 as usual

Here the term $p_{gb}$ in Line 1 is the probability of using goal biasing, and Line 2 decides according to an unbiased coin flip whether to extend toward the start or toward the goal. The function $rand()$ samples from the uniform distribution on $[0,1]$.

#### Multi-query PRMs¶

Another variant of PRMs that is useful in some scenarios is the multi-query PRM. As presented, the PRM method finds a path for a given $(s,g)$ query and then throws out the roadmap for the next query. In some cases, we would like the robot to plan another path in the same environment. Or, a team of robots may be traversing the same environment. In this case, it makes sense to precompute a good PRM and then reuse it for multiple queries. This is because the primary cost of PRM planning is in the construction phase, while the graph search phase is quite fast.

PRM construction proceeds like before, but without any endpoints. Then, to query the existing roadmap for a start and goal $(s,g)$, we try connecting $s$ and $g$ to nearby milestones using visibility queries. Then, the augmented PRM is searched for a path. To keep the roadmap from growing if many queries are to be made, $s$, $g$, and all the edges connecting them are removed from the roadmap before terminating the query.

TODO

### Shortcutting¶

As noted above, PRM and RRT are only concerned with finding a feasible path, and often produce jerky, unnatural paths. Shortcutting is a very useful postprocessing heuristic, illustrated in Figure 4, in which portions of a path are repeatedly replaced with shorter, feasible segments.

Figure 4. A shortcutting heuristic can quickly smooth out the jerkiest parts of paths produced by a sampling-based planner.

In order to do so, two random points are sampled along the path. With the path being a curve $y(s):[0,1]\rightarrow \mathcal{C}$, we sample two parameters $u,v \in [0,1]$. If $Visible(y(u),y(v))$ is true, then we replace the portion of the path between $u$ and $v$ with the straight line path. Otherwise, the sampling process begins again. This repeats for some number of iterations.

Shortcutting is only a local optimization technique, and not a very powerful one at that. But it is very fast, and this low overhead makes it a very practical method for getting rid of the worst parts of a jerky trajectory. In fact, we can construct an any-time planner that simply applies repeated restarts of an RRT (or PRM) followed by shortcutting. The shortest path found after shortcutting is maintained through each of these restarts. Eventually, we might get lucky and find a path close to optimal. It turns out that for many problems, this approach can outperform RRT* (or PRM*)!

TODO

TODO

### Common pitfalls in employing PRMs¶

Sampling-based motion planning is appealing since it can be implemented for a wide variety of problems by non-planning experts. However, there are several issues that can cause PRM's to fail. What is often frustrating is that the PRM will not provide a rationale for failure! It just appears that it just "doesn't work." Some of the most common pitfalls encountered when implement PRMs and their variants are:

• Improper handling of non-Euclidean topology of $\mathcal{C}$ in the distance metric $d(p,q)$ and dynamic collision checking function.

• Improper scaling of the C-space / badly scaled distance thresholds $R$ or $\delta$.

• Providing infeasible start and goal configurations.

• Providing start and goal configurations in "deep pockets": passages narrowing down as the endpoint is approached.

• Incorrect feasibility tests.

• Applying a planner when the free space volume is negligible, or narrow passages are infinitely thin.

When debugging, it is often extremely useful to extract and visually debug the roadmap produced by a planner. This helps diagnose problems like the planner taking tiny steps, not expanding the roadmap at all, or detecting phantom obstacles. This can be tricky in high-dimensional spaces, since visualization must ultimately take place on a 2D display, and a roadmap may contain thousands of configurations and edges.

To handle topology, it is extremely important to ensure that the notion of a "straight line path" in dynamic collision checking interpolates nearly along a geodesic, and that the distance metric is relatively close to a geodesic distance. When orientations are present, if this issue were neglected and the C-space were treated as Euclidean, then small positive rotations would never be connected to small negative rotations. This will manifest itself as artifacts in which the robot will either fail to find a path, or will rotate in an unnecessarily long fashion.

For choosing thresholds, a rule of thumb is to start by setting $R$ and $\delta$ to be approximately 10% of the diameter of the C-space. The values can then be fine-tuned to achieve better performance on a given problem. A good rule of thumb is to aim to achieve approximately 5--15 connections per milestone. This tends to work well for setting the value of $k$ when $k$-nearest neighbors is used as the nearness criterion in PRM.

The infeasible endpoint problem is often encountered when there is a bit of error in the world model or the robot's sensing of its configuration, and the robot starts or ends at a configuration that is in contact (or close to it.) There are two approaches to handling this: before planning, adjust the world model so that the robot is collision free (which can be hard), or slightly perturb $s$ and $g$ to new configurations $s^\prime$ and $g^\prime$ that are collision free with respect to the robot's current knowledge. Then, the path is planned between $s^\prime$ and $g^\prime$, and the robot executes path $s\rightarrow s^\prime \rightsquigarrow g^\prime \rightarrow g$. This, however, assumes that the path to the perturbed configurations is actually feasible in the real world, which requires a bit of care.

The deep pocket problem is faced particularly often in manipulation or docking, in which the start or goal has the robot touching the obstacle, and must make a careful, coordinated maneuver to leave it. For example, when the robot is grasping an object, its fingers are touching both sides of the object, and the hand must slide carefully in or out of position without moving laterally or rotating about certain axes. Hence, the passage is quite narrow in at least 2 or 3 dimensions! In these pockets of free space, the robot must take shorter steps, and most directions of travel lead to collision. However, once the pocket is escaped (like when the hand is away from a grasped object), then large steps can again be taken. In other words, visibility is nonuniform across $\mathcal{F}$. There are three general ways of handling this issue, all of which require studying the manipulation problem more carefully:

1. Manually define a short docking/undocking maneuver that inserts into / retracts from the pocket. This could be, for example in manipulation, a Cartesian move that places the gripper in front of the object with fingers wide open. The inverse of this maneuver is used to determine the start and goal points for the planner.

2. Start a tree-growing planner like RRT from the constrained endpoint with small step size. After some time, the farthest node from the start is assumed to have wiggled out of the pocket and point-to-point planning can begin from that new endpoint.

3. Develop an obstacle-sliding local planner or extension method that allows the planner to generate motions that slide against obstacles.

It is easy to make bugs when defining feasibility tests, particularly in more complex problems where feasibility requires passing many conditions. This is problematic because the subroutine is the only representation the planner has about the free space, so it needs to accurately reproduce the C-obstacles of the problem or else the planner will produce paths that will collide, or fail to find a solution where one obviously exists. There are some newer techniques that search for a small set of problematic C-obstacles blocking the way, which can help debug incorrect settings (Hauser 2012). But perhaps the first approach to try is to capture statistics during planning to detect the frequency at which each condition passes and fails inside the test. Some motion planning libraries will do this automatically and ask the user to define individual conditions, but in others this is up to the user. If a test never fails (or always passes) this suggests an obvious implementation bug.

Finally, the free space must not contain a contain a non-neglible volume of space (that is, $\mu(\mathcal{F}) / \mu(\mathcal{C})> 0$). This condition may be produced when a constraint is introduced (like an IK constraint, or a constraint that two objects must touch, or that a joint must take on a particular value) that leaves all feasible configurations on a manifold of lower-dimensionality of space. In these cases, the PRM will not be able to generate samples with non-neglible probability. One approach to handle this problem is to parameterize the solution manifold explicitly. Extensions of PRMs are also available to properly handle manifold constraints without a need for parameterization; these techniques generate samples by projecting them onto the feasible manifold, and also constructing paths that move along the manifold. These will be discussed later... (TODO: where will manipulation planning be added?)

## Incomplete Methods¶

In addition to the above methods that satisfy some notion of completeness, there are additional methods based on optimization techniques that are incomplete: they have no guarantee of finding a feasible path when one exists. They can, however, generally produce paths quickly when they do work.

### Potential fields¶

Potential fields are a well-studied technique that works using only local information to guide the robot's movement, and is therefore quite fast, making it appropriate for real-time obstacle avoidance as well as path planning in relatively simple spaces.

The general idea is to consider the robot's configuration as being a particle in some energy potential landscape. Due to "gravity" the particle will feel some virtual forces equal to the negative of the gradient of this landscape. If the landscape is constructed to have a global minimum at the goal, then by following the gradient the particle will, hopefully, arrive at the goal.

To construct such a landscape, the usual method is to combine an attractive potential field in which the force is some gain constant $k_{att}$ times the vector pointing direction to the goal: $$P_{att}(q) = \frac{1}{2}k_{att} \| q - q_g \|^2$$ along with a repulsive potential generating a repulsive force for each obstacle. The repulsive force is chosen to grow larger (typically toward infinity) as the robot gets closer to the obstacle. Some limiting distance $\rho_0$ is typically chosen where the effect of an obstacle drops off to 0. One such function is the following: $$P_{rep}(q) = \left\lbrace \begin{array}{ll} \frac{1}{2}k_{rep}(1/\rho(q) - 1/\rho_0)^2 & \text{If }\rho(q) \leq \rho_0\\ 0 & \text{If }\rho(q) > \rho_0 \end{array}\right.$$ Here $\rho(q)$ is a function that measures the workspace distance between the robot and the obstacle, and $k_{rep}$ modulates the strength of the force. The potential is infinity at $\rho(q)=0$ and drops down to 0 at $\rho(q) = \rho_0$. (Note that here we must be able to calculate distance rather than just Boolean collision detection.)

The force acting on the particle is the negated gradient of each potential: $$f_{att}(q) = -k_{att} (q - q_g)$$ and $$f_{rep}(q) = \left\lbrace \begin{array}{ll} k_{rep} (\frac{1}{\rho(q)} - \frac{1}{\rho_0})\frac{1}{\rho(q)^2} \frac{\partial \rho(q)}{\partial q} & \text{If }\rho(q) \leq \rho_0\\ 0 & \text{If }\rho(q) > \rho_0 \end{array}\right.$$

Then, to evolve the configuration over time as a particle in this potential field, we use an iterative approach. At the current time step, the robot is at position $q_t$. The next point along the path is given by: $$q_{t+1} = q_t + \frac{\Delta t}{m}(f_{att}(q) + f_{rep}(q))$$ where $m$ is a virtual "mass" of the robot and $\Delta t$ is the time step. One potential issue with this method is that the magnitude of the force vector can be highly varying, from 0 at the goal to infinity at the boundary of an obstacle. To avoid huge jumps (or little movement at all) in the path it makes sense to dynamically set the mass to be proportional to the magnitude of the force. In this way, consistent rate of progress is ensured as the path evolves.

This method works well when the robot simply needs to move slightly away from a straight line path to avoid obstacles, provided that obstacles are relatively convex and spatially distant. Its main advantages are 1) speed of computation, and 2) only local information about obstacles is needed. However, like other local methods it is prone to local minima caused either by concave obstacles, or narrow passages where the repulsive forces from either side of the passage cancel out the attractive force.

### Trajectory optimization¶

Trajectory optimization is another potential-based method that optimizes the overall shape the robot's path to minimize cost. Unlike potential fields, for which the optimization variable is the configuration is a single point in time, trajectory optimization uses some parameterization of the entire path as the optimization variable. This helps it avoid future obstacles and, in some cases, avoid local minima that potential fields would fall prey to.

Such methods begins with the definition of some fixed number of path parameters $\theta \in \mathbb{R}^N$ which dictate the shape of a candidate path. One example, for piecewise linear paths passing between the start and the goal configuration, is simply the set of intermediate milestones: $$\theta = (q_1,\ldots,q_{k-1})$$ In this case, the path $y(s)$ consists of $k$ straight-line segments, interpolating between milestones $q_0=q_s, q_1, \ldots, q_{k-1}, q_k=q_g$. Any value of $\theta$ dictates the shape of some path, and any piecewise linear path with $k$ segments corresponds to some value of $\theta$. To make this dependence clear, we shall refer to the path defined by some value $\theta$ as $y_\theta$.

If the dimension of C-space is $d$, then $N = d(k-1)$. Hence the trajectory optimization problem can be quite high dimensional (hundreds or thousands of dimensions) even for C-spaces with a moderate number of dimensions.

Next, we must encode the objective function and constraints. For minimizing path length, it may be tempting to initially define the following cost function that minimizes path length: $$f(\theta) = \sum_{i=1}^k \| q_i - q_{i-1} \|.$$ However, this formulation has the drawback that it is not differentiable when two milestones are equal, and also has a null direction when three milestones are on the straight line. It is more numerically convenient to minimize the sum of squared distances $$f(\theta) = \sum_{i=1}^k \| q_i - q_{i-1} \|^2.$$ which, if a $k$-segment piecewise linear path is indeed optimal, is minimized when the path (nearly) has minimum length and the milestones are evenly spaced.

Now let us proceed to defining constraints, which we assume are in the form $g(q) \leq 0$. At first glance, one might choose to simply enforce constraints on the milestones: $$h(\theta) = \begin{bmatrix}{g(q_1)} \\ {\vdots} \\ {g(q_{k-1})} \end{bmatrix} \leq 0.$$ However, this runs the risk of having two milestones on either side of an obstacle, with the intermediate segment crossing the obstacle. Instead, we must consider the possibility of constraint violations between milestones. A straightforward way to do so is to use collocation points, which are points along the path at which constraints will be enforced.

Specifically we can define some number $M$ of collocation points at parameters $s_1,\ldots,s_M \in [0,1]$, usually evenly distributed along the parameter space $[0,1]$. The $j$'th collocation point lies on a segment indexed by $i_j \in \{1,\ldots,k\}$ and lies a fraction $u_j \in [0,1]$ along the straight line segment, where these are determined so that the configuration at the collocation point is: $$y_\theta(s_j) = q_{i_j-1} + u_j (q_{i_j} - q_{i_j-1}).$$ We then define many inequality constraints on $\theta$ so that constraints at each collocation point are enforced: $$h(\theta) = \begin{bmatrix}{g(y_\theta(s_1))} \\ {\vdots} \\ {g(y_\theta(s_M))} \end{bmatrix} \leq 0.$$ The resulting problem is a constrained optimization problem (Appendix B.3.), which can be solved using a nonlinear program solver, like Sequential Quadratic Programming (SQP). Efficient implementations will take advantage of sparseness in the constraint Jacobian.

Another alternative lets us use unconstrained optimizations (Appendix B.3.) by converting hard constraints to penalties in the objective function. In this approach we defining a penalty function for violating constraints: $$f_{pen}(\theta) = \sum_{j=1}^M \max(g(y_\theta(s_j)), 0).$$ Then by minimizing a weighted objective function $$f(\theta) + w f_{pen}(\theta)$$ using standard nonlinear optimization techniques (e.g., Quasi-Newton methods), portions of the path for which constraints are violated will be pushed out of the C-obstacle. However, if $w$ is not set to be sufficiently high, then the optimizer of the weighted objective function will still slightly overlap with the obstacle. To address this, we can progressively increase $w$ to reduce the amount of overlap. To prevent overlap altogether, we can also allow the constraint violation penalty to extend a distance $\gamma > 0$ outside the region where the constraint is violated. $$f_{pen}(\theta; \gamma) = \sum_{j=1}^M \max(g(y_\theta(s_j)), -\gamma) + \gamma.$$

Regardless of whether a constrained or unconstrained approach is taken, there are two major issues with trajectory optimization:

• The computational cost of optimization depends strongly on the number of path parameters and collocation points. If too few path parameters are chosen then a feasible path may not be found; if too few collocation points are chosen then the path may violate constraints.

• For complex environments, the potential landscape in $\theta$ space is littered with local minima (and typically, more minima appear as the granularity $k$ of the path grows).

The problem of choosing collocation points can be addressed by adaptively identifying the point along the path with maximum constraint violation, in advanced optimization techniques known as constraint generation or semi-infinite programming.

The local minimum problem can be partially addressed either by initializing the optimizer with a path from some other motion planning method, like a sampling-based planner, or by using global optimization techniques. The approach of seeding an optimizer by a sampling-based planner is fast and often works well. However, does not guarantee a globally optimal path, because the planner may have produced a seed path in a suboptimal homotopy class or basin of attraction. Global optimization may result in better paths, but can be extraordinarily slow, particularly in high dimensional spaces.

## Summary¶

Key takeaways:

• Sampling-based motion planners can overcome some limitations of the curse of dimensionality. However, they pay a cost in the variance of solution quality and running time.

• The running time of such planners is dependent on the visibility characteristics of the free space, which does not directly relate to dimensionality. Running times will be fast in spaces of good visibility.

• Probabilistic roadmaps (PRMs) and Rapidly-Exploring Random Trees (RRTs) are the most widely used classes of such planners. There are many variations on the basic structure.

• Shortcutting can be employed in postprocessing to achieve fast (but local) improvements in path quality. To achieve global improvements, optimizing variants of PRMs and RRTs are available.

• Potential field methods use only local information to determine a direction of movement and are extremely fast. They can work well for real-time obstacle avoidance, but are prone to local minima.

• Trajectory optimization methods simultaneously optimize milestones along an entire trajectory. However, they require a choice of the number of milestones used to represent a path, and are also prone to local minima.