Instilling moral value alignment by means of multi-objective reinforcement learning

Rodriguez Soto, Manel; Serramià Amorós, Marc; López Sánchez, Maite; Rodríguez-Aguilar, Juan A. (Juan Antonio)

AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists of formalising moral values and value-aligned beh...

Bibliographic Details
Authors:	Rodriguez Soto, Manel; Serramià Amorós, Marc; López Sánchez, Maite; Rodríguez-Aguilar, Juan A. (Juan Antonio)
Format:	article
Status:	Published version
Publication Date:	2022
Country:	Spain
Institution:	Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository:	Recercat. Dipósit de la Recerca de Catalunya
OAI Identifier:	oai:recercat.cat:2445/192920
Online Access:	https://hdl.handle.net/2445/192920
Access Level:	Open access
Keyword:	Intel·ligència artificial; Aprenentatge per reforç (Intel·ligència artificial); Ètica; Aspectes morals; Artificial intelligence; Reinforcement learning; Ethics; Moral aspects

Description:	AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists of formalising moral values and value-aligned behaviour based on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent's individual and ethical objectives. The second step consists of designing an environment wherein an agent learns to behave ethically while pursuing its individual objective. We leverage our theoretical results to introduce an algorithm that automates our two-step approach. In the cases where value-aligned behaviour is possible, our algorithm produces a learning environment for the agent wherein it will learn value-aligned behaviour.
Publisher:	Springer
Source:	Reproduction of the document published at https://doi.org/10.1007/s10676-022-09635-0 (Ethics and Information Technology, 2022, vol. 24)
Language:	English
Extent:	17 p.
File format:	application/pdf
Collection:	Articles publicats en revistes (Matemàtiques i Informàtica)
Rights:	CC BY, (c) Manel Rodríguez Soto et al., 2022; http://creativecommons.org/licenses/by/3.0/es/

Similar Items