CM3 framework for deep multi-agent reinforcement learning in football

Patel, Shivani

Bibliographic Details
Author:	Patel, Shivani
Resource type:	master's thesis
Publication date:	2023
Country:	Spain
Institution:	Universitat Politècnica de Catalunya (UPC)
Repository:	UPCommons. Portal del coneixement obert de la UPC
Language:	English
OAI Identifier:	oai:upcommons.upc.edu:2117/407134
Online access:	https://hdl.handle.net/2117/407134
Access Level:	open access
Keywords:	Multiagent systems; Reinforcement learning; Deep Multi-Agent RL; Curriculum Learning; Unity ML-Agents; Sistemes multiagent; Aprenentatge per reforç; Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial

Description
Summary:	Collaboration among agents in cooperative and mixed multi-agent environments has been studied extensively in Deep Multi-Agent Reinforcement Learning (MARL). Cooperative behavior, and the roles that emerge from it, can benefit the agents collectively when they align their individual objectives towards a common goal, share resources effectively, and communicate efficiently to optimize their combined efforts. Research spans several sub-areas, namely communication in MARL (Comm-MARL), intrinsic rewards, exploration in MARL, curriculum learning, reward shaping, and emergent behavior. Cooperative Multi-Goal Multi-Stage Multi-Agent RL, abbreviated as CM3, is one such framework: it uses curriculum learning to tackle efficient exploration and a specialized policy function to tackle credit assignment. It has been tested on three multi-agent environments, where it learned significantly faster than direct adaptations of existing algorithms. In this thesis, we investigate whether football, viewed as a multi-agent domain, benefits from CM3. Drawing on work at the intersection of reinforcement learning and football, and on current state-of-the-art football algorithms such as TiKick and WeKick, which are based primarily on PPO, we compare how actor-critic algorithms such as A2C and PPO perform in our multi-agent environment. For this demonstration, we use a modified version of the Unity ML-Agents SoccerTwos environment. We also propose an enhancement to the original CM3 framework that extends training to a third stage in which the reward is independent of the goal. We hypothesized that this could improve coordination, because the team would then share a single collective goal, winning the match, rather than pursuing the individual goals of scoring or saving.
