Adaptive Identification of Legged Robotic Kinematic Structure
Bolun Dai
Masters Project 2019
Paper

Model-based control usually relies on an accurate model, which is often obtained from CAD and actuator models. The more accurate the model, the better the control performance. However, in bipedal robots that perform highly agile motions, such as running and hopping, the hardware suffers impacts with the environment and deforms at vulnerable parts, which invalidates the predefined model. Thus, it is desirable to have an adaptable kinematic structure that takes deformation into account. To this end, we propose an approach that models all of the robot’s joints as 6-DOF joints and develop an algorithm that identifies the kinematic structure from motion capture data. We evaluate the algorithm’s performance both in simulation, on a three-link pendulum, and on hardware, on the bipedal robot ATRIAS. In the simulated case, the algorithm produces a result with a 3.6% error relative to the ground truth, and on the real bipedal robot, the result confirms our prior assumption that the joints deform in out-of-plane degrees of freedom. In addition, our algorithm is able to predict torques and forces using the reconstructed joint model.

State Constrained Stochastic Optimal Control Using LSTM's
Bolun Dai, Prashanth Krishnamurthy, Andrew Papanicolaou, Farshad Khorrami
American Control Conference (ACC) 2021
Paper | Video | Poster

In this paper, we propose a new methodology for state constrained stochastic optimal control (SOC) problems. The solution builds on past work on solving SOC problems using forward-backward stochastic differential equations (FBSDE). Our approach to solving the FBSDE utilizes a deep neural network (DNN), specifically a Long Short-Term Memory (LSTM) network. LSTMs are chosen to address the curse of dimensionality, non-linearities, and long time horizons. In addition, the state constraints are incorporated using a hard penalty function, resulting in a controller that respects the constraint boundaries. The numerical instability that the penalty function would otherwise introduce is handled through an adaptive update scheme. The control design methodology is applicable to a large class of control problems. The performance and scalability of our proposed algorithm are demonstrated by numerical simulations.

Learning Locomotion Controllers for Walking Using Deep FBSDE
Bolun Dai, Virinchi Roy Surabhi, Prashanth Krishnamurthy, Farshad Khorrami
Preprint 2021
Paper

In this paper, we propose a deep forward-backward stochastic differential equation (FBSDE) based control algorithm for locomotion tasks. We also include state constraints in the FBSDE formulation to impose stable walking solutions or other constraints that one may want to consider (e.g., energy). Our approach utilizes a deep neural network (i.e., an LSTM) to solve the generally high-dimensional Hamilton-Jacobi-Bellman (HJB) equation resulting from the stated optimal control problem. Compared to traditional methods, our proposed method provides higher computational efficiency in real time, thus allowing the closed-loop controllers to run at a higher frequency. The efficacy of our approach is shown on a linear inverted pendulum model (LIPM) for walking. Even though we deploy a simplified model of walking, the methodology is applicable to generalized and complex models for walking and other control/optimization tasks in robotic systems. Simulation studies are provided to show the effectiveness of the proposed methodology.
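
For readers unfamiliar with the LIPM, the following is a minimal sketch of its standard dynamics, x'' = (g / z_c)(x - p), where x is the horizontal center-of-mass position, p is the foot position, and z_c is the constant center-of-mass height. It is not the controller from the paper; the function name, parameter names, and numerical values are hypothetical and chosen only for illustration.

```python
import math

GRAVITY = 9.81  # m/s^2


def lipm_state(x0, v0, p, z_c, t):
    """Closed-form LIPM center-of-mass state after time t.

    Assumes the foot position p and the CoM height z_c stay constant,
    so the dynamics reduce to x'' = (g / z_c) * (x - p).
    """
    omega = math.sqrt(GRAVITY / z_c)
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = p + (x0 - p) * c + (v0 / omega) * s
    v = (x0 - p) * omega * s + v0 * c
    return x, v


# Illustrative values: CoM starts 5 cm ahead of the foot, at rest.
x, v = lipm_state(x0=0.05, v0=0.0, p=0.0, z_c=0.9, t=0.3)
```

Because the equilibrium at x = p is unstable, any offset between the center of mass and the foot grows over time, which is why foot placement and state constraints matter for stable walking.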

Learning a Better Control Barrier Function
Bolun Dai, Prashanth Krishnamurthy, Farshad Khorrami
Conference on Decision and Control (CDC) 2022
Paper | Video

Control barrier functions (CBFs) are widely used in safety-critical controllers. However, the construction of valid CBFs is well known to be challenging, especially for nonlinear or non-convex constraints and high relative degree systems. On the other hand, finding a conservative CBF that only recovers a portion of the true safe set is usually possible. In this work, starting from a “conservative” handcrafted control barrier function (HCBF), we develop a method to find a control barrier function that recovers a reasonably larger portion of the safe set. Alternatively, by incorporating the hard constraints into an optimal control problem, e.g., model predictive control (MPC), we can safely generate solutions within the true safe set. Nevertheless, such an approach is usually computationally expensive and may not lend itself to real-time implementation. We propose to combine the two methods. During training, we utilize MPC to collect safe trajectory data. Thereafter, we train a neural network to estimate the difference between the HCBF and a CBF that more closely recovers the true safe set. Using the proposed approach, we can generate a safe controller that is less conservative and computationally efficient. We validate our approach on three systems: a second-order integrator, ball-on-beam, and unicycle.
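
As background, a CBF is typically used as a safety filter that minimally modifies a nominal control so that the condition L_f h + L_g h u + alpha h >= 0 holds. The sketch below is a generic illustration, not the learned CBF from the paper: it assumes a single affine constraint, a linear class-K function with gain alpha, and no input limits, in which case the quadratic program has a closed-form solution. All names are hypothetical.

```python
def cbf_safety_filter(u_nom, lg_h, lf_h, h, alpha=1.0):
    """Closed-form solution of
        min ||u - u_nom||^2  s.t.  lf_h + lg_h . u + alpha * h >= 0.
    """
    # Constraint value at the nominal control; non-negative means already safe.
    slack = lf_h + sum(a * u for a, u in zip(lg_h, u_nom)) + alpha * h
    if slack >= 0.0:
        return list(u_nom)
    norm_sq = sum(a * a for a in lg_h)
    if norm_sq == 0.0:
        raise ValueError("control has no effect on the constraint")
    # Project the nominal control onto the constraint boundary.
    scale = -slack / norm_sq
    return [u + scale * a for u, a in zip(u_nom, lg_h)]


# Illustrative 1D single integrator x' = u with safe set h(x) = x_max - x >= 0,
# so L_f h = 0 and L_g h = -1.
x, x_max = 0.9, 1.0
u_safe = cbf_safety_filter(u_nom=[1.0], lg_h=[-1.0], lf_h=0.0, h=x_max - x)
```

In this example the nominal control pushes toward the boundary, so the filter reduces it to roughly alpha * (x_max - x); a nominal control that already satisfies the condition is returned unchanged.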

Data-Efficient Control Barrier Function Refinement
Bolun Dai, Heming Huang, Prashanth Krishnamurthy, Farshad Khorrami
American Control Conference (ACC) 2023
Paper

Control barrier functions (CBFs) have been widely used for synthesizing controllers in safety-critical applications. When used as a safety filter, a CBF provides a simple and computationally efficient way to obtain safe controls from a possibly unsafe performance controller. Despite this conceptual simplicity, constructing a valid CBF is well known to be challenging, especially for high relative degree systems under nonconvex constraints. Recently, work has been done to learn a valid CBF from data based on a handcrafted CBF (HCBF). Even though the HCBF gives a good initialization point, training the CBF network still requires a large amount of data. In this work, we propose a new method to learn more efficiently from the collected data through a novel prioritized data sampling strategy. A priority score is computed from the loss value of each data point. Then, a probability distribution based on the priority scores of the data points is used to sample data and update the learned CBF. Using our proposed approach, we can learn a valid CBF that recovers a larger portion of the true safe set using a smaller amount of data. The effectiveness of our method is demonstrated in simulation on a unicycle and a two-link arm.
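
The sampling idea can be sketched as follows. The abstract does not give the exact priority score or distribution, so this sketch assumes a score proportional to the absolute loss (plus a small constant so that no point has zero probability), similar in spirit to prioritized experience replay. The function names and the eps and exponent parameters are hypothetical.

```python
import random


def priority_probabilities(losses, eps=1e-6, exponent=1.0):
    """Turn per-sample losses into a sampling distribution.

    Assumed form: priority = (|loss| + eps) ** exponent, then normalize.
    """
    priorities = [(abs(loss) + eps) ** exponent for loss in losses]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(data, losses, batch_size, rng=None):
    """Sample a training batch, favoring data points with larger loss."""
    rng = rng or random.Random()
    probs = priority_probabilities(losses)
    return rng.choices(data, weights=probs, k=batch_size)


# Illustrative usage with made-up data points and loss values.
data = ["a", "b", "c"]
losses = [0.1, 0.1, 0.8]
batch = sample_batch(data, losses, batch_size=4, rng=random.Random(0))
```

After each update of the learned CBF, the losses would be recomputed so that the distribution keeps tracking the points the network currently fits worst.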

Safe Navigation and Obstacle Avoidance Using Differentiable Optimization Based Control Barrier Functions
Bolun Dai, Rooholla Khorrambakht, Prashanth Krishnamurthy, Vinícius Gonçalves, Anthony Tzes, Farshad Khorrami
IEEE Robotics and Automation Letters
Paper | Video

Control barrier functions (CBFs) have been widely applied to safety-critical robotic applications. However, the construction of control barrier functions for robotic systems remains a challenging task. Recently, collision detection using differentiable optimization has provided a way to compute the minimum uniform scaling factor that results in an intersection between two convex shapes and to also compute the Jacobian of the scaling factor. In this paper, we propose a framework that uses this scaling factor, with an offset, to systematically define a CBF for obstacle avoidance tasks. We provide a theoretical analysis that proves the continuity of the proposed CBF. Empirically, we show that the proposed CBF is continuously differentiable, and the resulting optimal control problem is computationally efficient, which makes it applicable for real-time robotic control. We validate our approach, first using a 2D mobile robot example, then on the Franka-Emika Research 3 (FR3) robot manipulator both in simulation and experiment.
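
To make the scaling-factor idea concrete, the sketch below uses two circles, for which the minimum uniform scaling factor has a closed form: if both circles are scaled about their own centers, they first touch when the factor equals the center distance divided by the sum of the radii. This is only a stand-in for the differentiable-optimization computation used for general convex shapes; the function names and the offset value beta are hypothetical.

```python
import math


def scaling_factor(c_robot, r_robot, c_obs, r_obs):
    """Minimum uniform scaling of both circles at which they touch.

    A value above 1 means the unscaled circles do not intersect.
    """
    return math.dist(c_robot, c_obs) / (r_robot + r_obs)


def scaling_factor_grad(c_robot, r_robot, c_obs, r_obs):
    """Gradient of the scaling factor with respect to the robot center."""
    d = math.dist(c_robot, c_obs)
    if d == 0.0:
        raise ValueError("gradient is undefined for coincident centers")
    return [(a - b) / (d * (r_robot + r_obs)) for a, b in zip(c_robot, c_obs)]


def cbf_value(c_robot, r_robot, c_obs, r_obs, beta=1.05):
    """Candidate CBF: scaling factor minus an offset beta (assumed >= 1)."""
    return scaling_factor(c_robot, r_robot, c_obs, r_obs) - beta


# Illustrative configuration: unit circles whose centers are 4 units apart.
h = cbf_value((0.0, 0.0), 1.0, (4.0, 0.0), 1.0)
grad = scaling_factor_grad((0.0, 0.0), 1.0, (4.0, 0.0), 1.0)
```

The gradient points away from the obstacle, which is the direction a CBF-based controller would favor as the robot approaches the boundary of the safe set.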

State Constrained Stochastic Optimal Control for Continuous and Hybrid Dynamical Systems Using DFBSDE
Bolun Dai, Andrew Papanicolaou, Prashanth Krishnamurthy, Farshad Khorrami
Automatica
Paper

We develop a computationally efficient learning-based forward-backward stochastic differential equations (FBSDE) controller for both continuous and hybrid dynamical (HD) systems subject to stochastic noise and state constraints. Solutions to stochastic optimal control (SOC) problems satisfy the Hamilton–Jacobi–Bellman (HJB) equation. Using current FBSDE-based solutions, the optimal control can be obtained from the HJB equation using deep neural networks (e.g., long short-term memory (LSTM) networks). To ensure the learned controller respects the constraint boundaries, we enforce the state constraints using a soft penalty function. Extending previous work, we adapt the deep FBSDE (DFBSDE) control framework to handle HD systems consisting of continuous dynamics and a deterministic discrete state change. We demonstrate our proposed algorithm in simulation on a continuous nonlinear system (cart-pole) and a hybrid nonlinear system (five-link biped).

Minigrid & Miniworld - Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
Maxime Chevalier-Boisvert, Bolun Dai, Mark Towers, Rodrigo de Lazcano, Lucas Willems, Salem Lahlou, Suman Pal, Pablo Samuel Castro, Jordan Terry
NeurIPS 2023 Dataset and Benchmark Track
Paper

We present the Minigrid and Miniworld libraries, which provide a suite of goal-oriented 2D and 3D environments. The libraries were explicitly created with a minimalistic design paradigm to allow users to rapidly develop new environments for a wide range of research-specific needs. As a result, both have received widescale adoption by the RL community, facilitating research in a wide range of areas. In this paper, we outline the design philosophy, environment details, and their world generation API. We also showcase the additional capabilities brought by the unified API between Minigrid and Miniworld through case studies on transfer learning (for both RL agents and humans) between the different observation spaces. The source code of Minigrid and Miniworld can be found at https://github.com/Farama-Foundation/{Minigrid, Miniworld} along with their documentation at https://{minigrid, miniworld}.farama.org/.

DiffOcclusion - Differentiable Optimization Based Control Barrier Functions for Occlusion-Free Visual Servoing
Shiqing Wei, Bolun Dai, Rooholla Khorrambakht, Prashanth Krishnamurthy, Farshad Khorrami
IEEE Robotics and Automation Letters
Paper

The visibility (possibly partial) of some image features is crucial to a broad class of visual servoing-based control. In this work, we consider the setting of image-based visual servoing (IBVS) and address the fundamental problem of keeping a moving object with an unknown motion profile in the field of view while ensuring it remains unobstructed by obstacles. Assuming that the projections of the target and obstacles are both convex polygons, we propose a systematic method for circumscribing these polygons by strictly convex shapes with tunable accuracy. We prove that the minimal scaling factor such that two convex shapes intersect is continuously differentiable with respect to their vertex coordinates. Then, we formulate a control barrier function (CBF) based on this minimal scaling factor and incorporate a motion observer into occlusion-free visual servoing. The effectiveness of our method is validated through both simulation studies and experiments on the Franka Research 3 robotic arm.

Sailing Through Point Clouds - Safe Navigation Using Point Cloud Based Control Barrier Functions
Bolun Dai, Rooholla Khorrambakht, Prashanth Krishnamurthy, Farshad Khorrami
IEEE Robotics and Automation Letters
Paper | Video

The capability to navigate safely in an unstructured environment is crucial when deploying robotic systems in real-world scenarios. Recently, control barrier function (CBF) based approaches have been highly effective in synthesizing safety-critical controllers. In this work, we propose a novel CBF-based local planner composed of two components: Vessel and Mariner. The Vessel is a novel scaling factor based CBF formulation that synthesizes CBFs using only point cloud data. The Mariner is a CBF-based preview control framework that reduces the risk of getting stuck in spurious equilibria during navigation. To demonstrate the efficacy of our proposed approach, we first compare the proposed point cloud based CBF formulation with other point cloud based CBF formulations. Then, we demonstrate the performance of our proposed approach and its integration with global planners using experimental studies on the Unitree B1 and Unitree Go2 quadruped robots in various environments.