27638 Neuro-Evolutionary Deep Reinforcement Learning for Robotic Control

Problem statement:

Deep reinforcement learning (DRL) is a rapidly evolving research area in machine learning, often cited as a technology that could revolutionize the field of artificial intelligence (AI). One of the primary goals of AI is to develop fully autonomous agents that learn optimal behavior by interacting with their environment and improve over time through trial and error. The mathematical framework for such experience-driven autonomous learning is known as reinforcement learning (RL). Although RL had some successes in the past, earlier approaches lacked scalability: their computational complexity limited them to relatively low-dimensional problems that rarely capture complex real-life scenarios. The rise of deep learning provided new tools to overcome these limitations, resulting in a powerful new framework, deep reinforcement learning. DRL has already demonstrated remarkable results in applications ranging from playing video games to indoor navigation, with agents including both robots and softbots. For example, control policies for robots can be learned directly from camera inputs in the real world.


Figure 1. The perception-action-learning loop. Source: K. Arulkumaran et al., Deep Reinforcement Learning: A Brief Survey, IEEE Signal Processing Magazine, November 2017.

Figure 2. Examples of DRL applications from gym.openai.com (Space Invaders, Ant, and LunarLander)
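The perception-action-learning loop of Figure 1 can be made concrete in a few lines of code. The sketch below is a minimal, self-contained illustration with no Gym dependency; the five-cell corridor environment, its reward scheme, and all names are invented for this example. A tabular Q-learning agent perceives a state, acts epsilon-greedily, and learns from the observed reward.

```python
import random

# Toy environment: a 5-cell corridor. Reward +1 for reaching the
# rightmost cell; actions are 0 = left, 1 = right. (Illustrative only,
# standing in for the Gym environments shown above.)
N_STATES, GOAL = 5, 4
ACTIONS = [0, 1]

def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular Q-values
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Perception -> action: epsilon-greedy over current Q-values.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])
            s2, r, done = step(s, a)
            # Learning: temporal-difference update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
# The learned greedy policy should move right in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)  # -> [1, 1, 1, 1]
```

The same loop structure carries over to the Gym environments of Figure 2; once the state space grows, the Q-table is replaced by a function approximator such as a deep or evolved neural network.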

Although deep learning brings many advantages, it has drawbacks as well. Its flexibility and scalability are at once its best and its worst qualities. They make deep learning applicable in many different scenarios and use cases, but they also introduce a large number of hyperparameters, such as the number of hidden layers, the number of nodes within each layer, the connectivity between layers, the type of each layer, the activation function per layer, and the loss function. All of these hyperparameters need to be tuned to optimize the function approximation.

Neuro-evolution is a branch of artificial intelligence that uses evolutionary algorithms to generate artificial neural networks. It is mostly used in artificial life, general game playing, and evolutionary robotics. It gradually builds up an artificial neural network by applying evolutionary operators such as mutation and crossover. Some neuro-evolution algorithms have proven able to compete with deep reinforcement learning algorithms, the most prominent being NEAT (NeuroEvolution of Augmenting Topologies). NEAT manages to find a very efficient network topology without using any gradients, so it does not suffer from issues such as the exploding or vanishing gradient problem.
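The evolutionary operators mentioned above can be illustrated with a toy example. The sketch below evolves the weights of a tiny fixed-topology network to fit XOR using only mutation, crossover, and selection, with no gradients involved. Note that NEAT goes further and evolves the topology itself; the network shape, population sizes, and operator choices here are illustrative assumptions, not NEAT.

```python
import math
import random

rng = random.Random(42)
SIZE = 9  # 2x2 hidden weights + 2 hidden biases + 2 output weights + 1 output bias

def forward(w, x1, x2):
    """Fixed 2-2-1 tanh network with weight vector w."""
    h1 = math.tanh(w[0] * x1 + w[1] * x2 + w[2])
    h2 = math.tanh(w[3] * x1 + w[4] * x2 + w[5])
    return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

CASES = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # XOR truth table

def fitness(w):
    # Negative squared error over the four XOR cases (higher is better).
    return -sum((forward(w, a, b) - y) ** 2 for a, b, y in CASES)

def evolve(pop_size=60, generations=400, sigma=0.4):
    pop = [[rng.gauss(0, 1) for _ in range(SIZE)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 5]          # selection: keep the fittest fifth
        children = list(elite)                # elitism: best genomes survive intact
        while len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [g + rng.gauss(0, sigma) for g in child]    # Gaussian mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve()
```

Because XOR is not linearly separable, any genome that scores well here has necessarily evolved a useful hidden representation, which is exactly the kind of structure NEAT discovers while also growing the topology.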


Objective:

The first goal of this thesis is to study and thoroughly understand the principles of deep reinforcement learning as well as of neuro-evolutionary algorithms. Secondly, the student should implement and empirically evaluate existing deep RL architectures using the OpenAI Gym toolkit for developing and comparing reinforcement learning algorithms. As the main task, the student should use neuro-evolutionary methods instead of traditionally trained neural networks as the reinforcement learning agent's function approximator. The candidate artificial neural network variants are the following: a traditional neural network, an implementation of NEAT (NeuroEvolution of Augmenting Topologies), and an implementation of EANT (Evolutionary Acquisition of Neural Topologies). Different ways to prioritize the rewards could be introduced and then used with the existing algorithms to solve different tasks; these tasks will be chosen in agreement with the student. At the beginning of the semester, a minicourse will be organized to put the student on the right track and to familiarize him/her with the topic. Furthermore, the existing code and literature will be made available to the student.
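As a hint of what the main task could look like, the sketch below evolves a one-parameter policy for a toy one-dimensional control problem, using the episode return as the fitness function. The environment, the linear policy, and all parameter values are illustrative assumptions made for this example; in the thesis these would be replaced by Gym tasks and by NEAT- or EANT-evolved networks.

```python
import random

rng = random.Random(1)

def episode_return(gain, steps=50):
    """Roll out one episode: a point starts at position 5 and the policy
    pushes it by -gain * position each step; reward penalizes distance
    from the origin. Higher (less negative) return is better."""
    pos, total = 5.0, 0.0
    for _ in range(steps):
        pos += -gain * pos          # policy: action proportional to state
        total -= abs(pos)           # reward: stay close to the origin
    return total

def evolve(pop_size=30, generations=40, sigma=0.3):
    # Evolutionary policy search: fitness = episode return, no gradients.
    pop = [rng.gauss(0, 1) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=episode_return, reverse=True)
        elite = pop[: pop_size // 5]
        pop = elite + [rng.choice(elite) + rng.gauss(0, sigma)
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=episode_return)

best_gain = evolve()
```

The optimal gain for this toy problem is 1.0 (the point jumps to the origin in one step), so the evolved controller should end up close to that value. Replacing the single gain with a full genome of network weights and topology genes yields the neuro-evolutionary RL agents studied in this thesis.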

References:

  1. K. Arulkumaran et al. Deep Reinforcement Learning: A Brief Survey, 2017.
  2. Gym: A toolkit for developing and comparing reinforcement learning algorithms. https://gym.openai.com/docs, https://github.com/openai/gym
  3. A. Choudhary. A hands-on introduction to Deep Q-learning using OpenAI Gym in Python, 2019, https://www.analyticsvidhya.com/blog/2019/04/introduction-deep-q-learning-python/