Transferring knowledge as heuristics in reinforcement learning: A case-based approach

The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). Thi...

Bibliographic Details
Authors: Bianchi, Reinaldo; Celiberto, Luiz A.; Santos, Paulo E.; Matsuura, Jackson P.; López de Mántaras, Ramón
Resource type: article
Status: Submitted version (sent for review and publication)
Publication date: 2015
Country: Spain
Institution: Consejo Superior de Investigaciones Científicas (CSIC)
Repository: DIGITAL.CSIC. Repositorio Institucional del CSIC
OAI Identifier: oai:dnet:digitalcsic_::fc26750ba49af6fc566128c0d57b5353
Online access: http://hdl.handle.net/10261/130283
Access level: open access
Keywords: Learning process
Humanoid robots
Empirical evaluations
Case-based approach
2D simulations
Reinforcement learning
Heuristic methods
Meta-algorithms
Case-based reasoning
Anthropomorphic robots
Transfer learning
Target domain
id ES_3e75fb0b718971a89400a2c6d261c335
oai_identifier_str oai:dnet:digitalcsic_::fc26750ba49af6fc566128c0d57b5353
network_acronym_str ES
network_name_str España
repository_id_str
spelling Transferring knowledge as heuristics in reinforcement learning: A case-based approach
Bianchi, Reinaldo
Celiberto, Luiz A.
Santos, Paulo E.
Matsuura, Jackson P.
López de Mántaras, Ramón
Learning process
Humanoid robots
Empirical evaluations
Case-based approach
2D simulations
Reinforcement learning
Heuristic methods
Meta-algorithms
Case-based reasoning
Anthropomorphic robots
Transfer learning
Target domain
The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations were conducted in two target domains: the 3D mountain car (using a learned case base from a 2D simulation) and stability learning for a humanoid robot in the Robocup 3D Soccer Simulator (that uses knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms. © 2015 Elsevier B.V.
Luiz Celiberto Jr. and Reinaldo Bianchi acknowledge the support of FAPESP (grants 2012/14010-5 and 2011/19280-8). Paulo E. Santos acknowledges support from FAPESP (grant 2012/04089-3) and CNPq (grant PQ2 -303331/2011-9).
Peer Reviewed
Elsevier
Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72]
2016
2016
2015
2016
info:eu-repo/semantics/article
http://purl.org/coar/resource_type/c_6501
Preprint
info:eu-repo/semantics/submittedVersion
http://hdl.handle.net/10261/130283
reponame:DIGITAL.CSIC. Repositorio Institucional del CSIC
instname:Consejo Superior de Investigaciones Científicas (CSIC)
English
Yes
info:eu-repo/semantics/openAccess
oai:dnet:digitalcsic_::fc26750ba49af6fc566128c0d57b5353
2026-05-22T06:33:51Z
dc.title.none.fl_str_mv Transferring knowledge as heuristics in reinforcement learning: A case-based approach
dc.creator.none.fl_str_mv Bianchi, Reinaldo
Celiberto, Luiz A.
Santos, Paulo E.
Matsuura, Jackson P.
López de Mántaras, Ramón
dc.contributor.none.fl_str_mv Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72]
dc.subject.none.fl_str_mv Learning process
Humanoid robots
Empirical evaluations
Case-based approach
2D simulations
Reinforcement learning
Heuristic methods
Meta-algorithms
Case-based reasoning
Anthropomorphic robots
Transfer learning
Target domain
description The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations were conducted in two target domains: the 3D mountain car (using a learned case base from a 2D simulation) and stability learning for a humanoid robot in the Robocup 3D Soccer Simulator (that uses knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms. © 2015 Elsevier B.V.
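The three stages named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say how the heuristic enters the learner, so the sketch assumes an additive bonus on action selection in tabular Q-learning. The action mapping is a hand-written dictionary standing in for the unsupervised mapping of stage two, and a toy chain world stands in for the mountain car and robot domains. All function names, parameters, and the state mapping `s // 5` are hypothetical.

```python
import random
from collections import defaultdict

def greedy(q, state, actions, bonus=lambda s, a: 0.0):
    """Action maximising Q(s, a) + bonus(s, a); ties go to the first action."""
    return max(actions, key=lambda a: q[(state, a)] + bonus(state, a))

def q_learning(step, actions, start, goal, episodes=200, bonus=lambda s, a: 0.0,
               alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning; `bonus` biases action choice only, not the update."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            if s == goal:
                break
            explore = rng.random() < epsilon
            a = rng.choice(actions) if explore else greedy(q, s, actions, bonus)
            s2, r = step(s, a)
            target = r + gamma * max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

def make_chain(n, left, right):
    """Toy chain of n states (assumed domain); reward 1 on reaching state n - 1."""
    def step(s, a):
        s2 = min(s + 1, n - 1) if a == right else max(s - 1, 0)
        return s2, (1.0 if s2 == n - 1 else 0.0)
    return step

def build_case_base(q, states, actions):
    """Stage 1 output: one case per source state, holding its greedy action."""
    return {s: greedy(q, s, actions) for s in states}

def make_heuristic(case_base, action_map, to_source_state, eta=1.0):
    """Stage 3: H(s, a) = eta when `a` is the mapped action of the retrieved case."""
    def h(state, action):
        case_action = case_base.get(to_source_state(state))
        return eta if action_map.get(case_action) == action else 0.0
    return h

def transfer_demo():
    src_actions, tgt_actions = ["left", "right"], ["west", "east"]
    # Stage 1: learn on the small source chain and store the case base.
    q_src = q_learning(make_chain(5, "left", "right"), src_actions, start=0, goal=4)
    case_base = build_case_base(q_src, range(4), src_actions)
    # Stage 2: hand-written stand-in for the unsupervised action mapping.
    action_map = {"left": "west", "right": "east"}
    # Stage 3: use the cases as a heuristic while learning on the larger chain.
    h = make_heuristic(case_base, action_map, to_source_state=lambda s: s // 5)
    q_tgt = q_learning(make_chain(20, "west", "east"), tgt_actions,
                       start=0, goal=19, bonus=h)
    return case_base, q_tgt
```

Calling `transfer_demo()` returns the source case base and the target Q-table. Keeping the heuristic out of the Q-update is a design choice of this sketch: the bonus steers exploration while the learned values stay unbiased.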
publishDate 2015
dc.date.none.fl_str_mv 2015
2016
2016
2016
dc.type.none.fl_str_mv info:eu-repo/semantics/article
http://purl.org/coar/resource_type/c_6501
Preprint
info:eu-repo/semantics/submittedVersion
format article
status_str submittedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/10261/130283
url http://hdl.handle.net/10261/130283
dc.language.none.fl_str_mv English
language_invalid_str_mv English
dc.relation.none.fl_str_mv
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:DIGITAL.CSIC. Repositorio Institucional del CSIC
instname:Consejo Superior de Investigaciones Científicas (CSIC)
instname_str Consejo Superior de Investigaciones Científicas (CSIC)
reponame_str DIGITAL.CSIC. Repositorio Institucional del CSIC
collection DIGITAL.CSIC. Repositorio Institucional del CSIC
repository.name.fl_str_mv
repository.mail.fl_str_mv