DodgeDrone-Challenge [COMPLETED]
About the Challenge
The DodgeDrone-Challenge was held in conjunction with the ICRA Workshop on Perception and Action in Dynamic Environments.
Our submission
I teamed up with Huiyu for our submission to this challenge and we ended up in 2nd place!
Our implementation used the deep Reinforcement Learning (deep RL) method Proximal Policy Optimization (PPO) for drone control. Because PPO is not very sample-efficient, we only trained the drone to station-keep/hover at a fixed goal location. To navigate between waypoints, we then transform the drone's observed position so that the desired waypoint coincides with that fixed goal position.
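The sketch below illustrates this idea, but it is not our actual training code: a toy point-mass environment, names, rewards, and thresholds are all placeholders I am assuming here for illustration, and it is written against the classic Gym API (newer Stable-Baselines3 releases expect Gymnasium's slightly different interface). The key piece is the observation wrapper that shifts the position so the hover policy "sees" the active waypoint as its fixed goal.

```python
import numpy as np
import gym
from gym import spaces
from stable_baselines3 import PPO


class ToyHoverEnv(gym.Env):
    """Simplified point-mass 'hover at the origin' environment (placeholder, not the challenge sim)."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)
        self.state = None

    def reset(self):
        # Position and velocity, initialised randomly around the fixed goal (the origin).
        self.state = np.concatenate([np.random.uniform(-1, 1, 3), np.zeros(3)]).astype(np.float32)
        return self.state

    def step(self, action):
        pos, vel = self.state[:3], self.state[3:]
        vel = vel + 0.1 * np.asarray(action)   # crude integration of an acceleration command
        pos = pos + 0.1 * vel
        self.state = np.concatenate([pos, vel]).astype(np.float32)
        reward = -float(np.linalg.norm(pos))   # closer to the goal is better
        done = bool(np.linalg.norm(pos) > 10.0)
        return self.state, reward, done, {}


class WaypointShift(gym.ObservationWrapper):
    """Shift the observed position so the current waypoint appears at the trained goal (origin)."""

    def __init__(self, env, waypoint):
        super().__init__(env)
        self.waypoint = np.asarray(waypoint, dtype=np.float32)

    def observation(self, obs):
        shifted = obs.copy()
        shifted[:3] = obs[:3] - self.waypoint  # the hover policy now treats the waypoint as its goal
        return shifted


if __name__ == "__main__":
    # Train the hover policy (far fewer timesteps than a real run).
    model = PPO("MlpPolicy", ToyHoverEnv(), verbose=0)
    model.learn(total_timesteps=10_000)

    # At deployment time, wrap the env so the same policy tracks a waypoint instead.
    tracked = WaypointShift(ToyHoverEnv(), waypoint=[2.0, 0.0, 1.0])
    obs = tracked.reset()
    action, _ = model.predict(obs, deterministic=True)
```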
The waypoints are decided by our local planner, which simply interpolates between the current drone location and the goal from the high-level planner. For obstacle avoidance, we use images from the onboard depth camera to detect nearby obstacles; if an obstacle is present along or near the path, the local planner replans the waypoints.
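A rough sketch of that local-planning logic is below. It is not our exact planner: the function names, region-of-interest size, and clearance threshold are illustrative assumptions, and the depth frame here is faked.

```python
import numpy as np


def interpolate_waypoints(current_pos, goal_pos, step_size=1.0):
    """Return evenly spaced waypoints along the straight line from current_pos to goal_pos."""
    current_pos = np.asarray(current_pos, dtype=float)
    goal_pos = np.asarray(goal_pos, dtype=float)
    dist = np.linalg.norm(goal_pos - current_pos)
    n = max(int(np.ceil(dist / step_size)), 1)
    # Fractions in (0, 1] so the final waypoint is exactly the goal.
    fractions = np.linspace(1.0 / n, 1.0, n)[:, None]
    return current_pos + fractions * (goal_pos - current_pos)


def obstacle_ahead(depth_image, min_clearance=2.0, roi=0.3):
    """Check a central crop of the depth image for anything closer than min_clearance metres."""
    h, w = depth_image.shape
    ch, cw = int(h * roi), int(w * roi)
    centre = depth_image[(h - ch) // 2:(h + ch) // 2, (w - cw) // 2:(w + cw) // 2]
    return float(np.nanmin(centre)) < min_clearance


if __name__ == "__main__":
    waypoints = interpolate_waypoints([0, 0, 1], [10, 0, 1], step_size=2.0)
    depth = np.full((240, 320), 8.0)  # fake depth frame: everything 8 m away
    if obstacle_ahead(depth):
        # Replan through a laterally offset goal if the straight path is blocked.
        waypoints = interpolate_waypoints([0, 0, 1], [10, 3, 1], step_size=2.0)
    print(waypoints)
```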
Our code is available on our GitHub repo, and the Docker image of our submission is on Docker Hub as linncy/ddc:submission-Team-LhylC.
Afterthoughts
The main goal of this project was to get hands-on experience in developing deep RL for robotics, and this goal was largely met. I familiarised myself with common libraries such as OpenAI Gym and Stable-Baselines3, and I also learned a lot about state-of-the-art deep RL methods like PPO and SAC.
Although we got second place, our placing isn't that impressive given the small number of submissions. The winners also had significantly better performance than ours, which is expected given their more sophisticated algorithms. Moving forward, we hope to continue working on this problem statement to improve our method's performance.
Notably, our submission does not really play to the strengths of RL. I hope to expand the use of RL into planning and obstacle avoidance rather than just control. Having said that, using RL for control does seem to have at least one benefit: it does not require hand-tuning. I had also been working on a PID controller for the drone but did not manage to tune it in time for the submission; PID/PD tuning is time-consuming and difficult, so I will consider automatic PID/PD tuning in future work.