Learning Non-Stationary Sampling Distributions for Robot Motion Planning

Ratnesh Madaan*, Sam Zeng*, Brian Okorn, Rogerio Bonatti
*Equal Contribution
16-831 Statistical Techniques in Robotics & 16-782 Planning and Decision-making in Robotics, Fall 2017.
Report, Videos, Slides - 16-782, Slides - 16-831,
Code - Modified OMPL (C++), Code - Training Pipeline (Python)

Here, we learn an adaptive sampling distribution for RRTs (Rapidly-exploring Random Trees), conditioned on both the workspace environment and the instantaneous planning graph, using CVAEs (Conditional Variational Auto-Encoders). This project is inspired by two recent papers: Ichter et al., Learning Sampling Distributions for Robot Motion Planning (ICRA 2018), and Bhardwaj et al., Search as Imitation Learning (CoRL 2017).
Ichter et al. learn a static distribution by conditioning on the workspace environment. Bhardwaj et al. learn an adaptive heuristic policy for search-based planners by imitating an oracle that has full access to the world at train time. We combine these two ideas by conditioning the distribution on both the planning graph and the environment, and train the policy with DAgger.
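
As a rough sketch of this training procedure (not the project's actual code), a DAgger-style loop could aggregate (environment image, tree image, oracle sample) tuples under the learner's own rollouts and periodically retrain the CVAE. Here `run_rrt_with_sampler`, `oracle_sample`, and the `cvae` object's `fit`/`sample` methods are hypothetical placeholders:

```python
# Hypothetical DAgger-style training loop; run_rrt_with_sampler, oracle_sample,
# and the cvae object's fit/sample methods are placeholders, not the project API.
import random

def dagger_train(environments, cvae, n_iters=10, beta0=1.0):
    """Aggregate (env image, tree image, oracle sample) tuples and retrain the CVAE."""
    dataset = []
    for i in range(n_iters):
        beta = beta0 * (0.5 ** i)  # probability of executing the oracle's sample
        for env in environments:
            # Roll out RRT, mixing oracle samples and learned samples at each expansion.
            tree_snapshots = run_rrt_with_sampler(
                env,
                sampler=lambda tree: (oracle_sample(env, tree)
                                      if random.random() < beta
                                      else cvae.sample(env.image, tree.image)))
            # Label every visited tree state with the oracle's preferred sample.
            for tree in tree_snapshots:
                dataset.append((env.image, tree.image, oracle_sample(env, tree)))
        cvae.fit(dataset)  # retrain on the aggregated dataset
    return cvae
```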

The first key challenge is to define what the optimal sample is, given a partially solved planning tree. For the tree corresponding to a partially solved planning problem, we generate a set of optimal samples that have collision-free connections to the tree and a low cost-to-goal starting from the sample (obtained by computing backward Dijkstra distances from the goal). The second challenge lies in extracting a feature representation from the ever-growing random tree. For the purposes of the course project, we used a two-stream convolutional CVAE which takes both an image of the environment and an image of the instantaneous tree, and outputs the optimal sample.
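
To make the two-stream architecture concrete, below is a minimal PyTorch sketch of a two-stream convolutional CVAE in the spirit described above; the layer sizes, 64x64 image resolution, and latent dimension are illustrative assumptions rather than the project's actual architecture (the released training pipeline is the authoritative reference):

```python
# Illustrative two-stream convolutional CVAE; sizes are assumptions, not the
# project's actual architecture.
import torch
import torch.nn as nn

def conv_stream():
    # Small CNN that maps a 1x64x64 image to a flat conditioning vector.
    return nn.Sequential(
        nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.Flatten(),
        nn.Linear(32 * 16 * 16, 128), nn.ReLU())

class TwoStreamCVAE(nn.Module):
    def __init__(self, sample_dim=2, latent_dim=4):
        super().__init__()
        self.env_stream = conv_stream()    # encodes the workspace image
        self.tree_stream = conv_stream()   # encodes the rasterized planning tree
        cond_dim = 128 + 128
        self.enc = nn.Sequential(nn.Linear(sample_dim + cond_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 128), nn.ReLU(),
            nn.Linear(128, sample_dim))    # predicted optimal sample (x, y)

    def condition(self, env_img, tree_img):
        return torch.cat([self.env_stream(env_img), self.tree_stream(tree_img)], dim=1)

    def forward(self, sample, env_img, tree_img):
        c = self.condition(env_img, tree_img)
        h = self.enc(torch.cat([sample, c], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(torch.cat([z, c], dim=1)), mu, logvar

    def sample(self, env_img, tree_img, n=1):
        # Draw samples by decoding latent noise under the given conditioning.
        c = self.condition(env_img, tree_img).repeat(n, 1)
        z = torch.randn(n, self.mu.out_features)
        return self.dec(torch.cat([z, c], dim=1))
```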

Although we demonstrated results on a 2D holonomic problem, we are working towards generalizing our framework to higher-dimensional planning problems and plan to submit to CoRL 2018.

In each video, the start state is at the bottom left of the image and the goal state at the top right; blue shows the tree's progress, and the distribution is visualized by fitting a Kernel Density Estimator to samples from the CVAE. The chosen sample itself is shown in yellow.
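
A minimal sketch of that visualization step, assuming the TwoStreamCVAE sketch above and scipy/matplotlib for the density overlay (the project's actual plotting code may differ):

```python
# Sketch of the KDE visualization; assumes env_img / tree_img are 1x1xHxW torch
# tensors with coordinates normalized to [0, 1], matching the CVAE sketch above.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

def plot_sampling_distribution(cvae, env_img, tree_img, n_samples=500, grid=64):
    samples = cvae.sample(env_img, tree_img, n=n_samples).detach().numpy()  # (n, 2)
    kde = gaussian_kde(samples.T)                       # fit a 2D density to the samples
    xs, ys = np.meshgrid(np.linspace(0, 1, grid), np.linspace(0, 1, grid))
    density = kde(np.vstack([xs.ravel(), ys.ravel()])).reshape(grid, grid)
    plt.imshow(env_img.squeeze().cpu().numpy(), cmap='gray',
               extent=(0, 1, 0, 1), origin='lower')     # workspace image
    plt.imshow(density, cmap='viridis', alpha=0.5,
               extent=(0, 1, 0, 1), origin='lower')     # learned sampling density
    plt.scatter(samples[:, 0], samples[:, 1], s=2, c='yellow')  # raw CVAE samples
    plt.show()
```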

Implementation Notes: We exposed OMPL's RRT to Python via an interface that sends the current graph to Python at each iteration, where the CVAE uses the graph and the environment to generate a sample, which is then returned to and used by OMPL's RRT.
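
For illustration, a hypothetical Python-side sampler callback might look like the following; the graph format, `rasterize_tree`, and the registration mechanism are assumptions, since the actual bridge lives in the modified OMPL (C++) code linked above:

```python
# Hypothetical Python-side sampler callback; the graph format and rasterization
# are placeholder assumptions, and the real bridge is in the modified OMPL (C++).
import numpy as np
import torch

def rasterize_tree(vertices, resolution=64):
    """Draw the current tree vertices (normalized to [0, 1]) into a binary image."""
    img = np.zeros((resolution, resolution), dtype=np.float32)
    for x, y in vertices:
        img[int(y * (resolution - 1)), int(x * (resolution - 1))] = 1.0
    return img

def make_sampler_callback(cvae, env_img):
    """env_img: 2D numpy array of the workspace; returns a callback for the planner."""
    env_t = torch.from_numpy(env_img.astype(np.float32))[None, None]  # 1x1xHxW
    def sampler_callback(vertices, edges):
        # Called by the modified RRT at every iteration with the current graph;
        # returns the next (x, y) sample used to extend the tree.
        tree_t = torch.from_numpy(rasterize_tree(vertices))[None, None]
        with torch.no_grad():
            sample = cvae.sample(env_t, tree_t)
        return sample.squeeze().tolist()
    return sampler_callback
```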