Transferring knowledge as heuristics in reinforcement learning: A case-based approach
The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). Thi...
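As a rough illustration of the idea summarised in this record (a case base learned in a source domain, an action mapping, and reuse of the cases as heuristics in the target domain), the sketch below shows one way a retrieved case could bias action selection. It is not the authors' implementation: the `Case` class, the nearest-neighbour retrieval, the additive `Q + xi * H` selection rule, the `bonus` and `xi` parameters, and the hand-written `action_map` are all assumptions made for illustration, and the toy assumes source and target states can be compared in the same coordinate space.

```python
import math
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical case: a source-domain state and the action taken there."""
    state: tuple
    action: int


def retrieve(case_base, state):
    """Return the stored case closest to `state` (Euclidean distance, an assumption)."""
    return min(case_base, key=lambda c: math.dist(c.state, state))


def heuristic(case_base, action_map, state, n_actions, bonus=1.0):
    """H(s, a): `bonus` for the target action mapped from the retrieved case, else 0."""
    suggested = action_map[retrieve(case_base, state).action]
    return [bonus if a == suggested else 0.0 for a in range(n_actions)]


def select_action(q_values, h_values, xi=1.0):
    """Greedy choice over Q(s, a) + xi * H(s, a) (assumed combination rule)."""
    scores = [q + xi * h for q, h in zip(q_values, h_values)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: two source cases and a hand-written source-to-target action map.
case_base = [Case((0.0, 0.0), 0), Case((1.0, 1.0), 1)]
action_map = {0: 2, 1: 3}  # assumed here; the record says this mapping is obtained unsupervised

state = (0.9, 0.8)
h = heuristic(case_base, action_map, state, n_actions=4)
q = [0.0, 0.0, 0.0, 0.0]  # untrained target-domain Q-values
chosen = select_action(q, h)
print(chosen)  # 3: the heuristic decides while Q is still uninformative
```

With untrained Q-values the heuristic alone determines the action; once the learned Q-values differ by more than `xi * bonus`, they override the suggestion, so the case base only guides early exploration in this sketch.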
| Authors: | Bianchi, Reinaldo; Celiberto, Luiz A.; Santos, Paulo E.; Matsuura, Jackson P.; López de Mántaras, Ramón |
|---|---|
| Resource type: | article |
| Status: | Version submitted for review and publication |
| Publication date: | 2015 |
| Country: | Spain |
| Institution: | Consejo Superior de Investigaciones Científicas (CSIC) |
| Repository: | DIGITAL.CSIC. Repositorio Institucional del CSIC |
| OAI Identifier: | oai:dnet:digitalcsic_::fc26750ba49af6fc566128c0d57b5353 |
| Online access: | http://hdl.handle.net/10261/130283 |
| Access Level: | open access |
| Keywords: | Learning process; Humanoid robots; Empirical evaluations; Case-based approach; 2D simulations; Reinforcement learning; Heuristic methods; Meta-algorithms; Case-based reasoning; Anthropomorphic robots; Transfer learning; Target domain |
| id |
ES_3e75fb0b718971a89400a2c6d261c335 |
|---|---|
| oai_identifier_str |
oai:dnet:digitalcsic_::fc26750ba49af6fc566128c0d57b5353 |
| network_acronym_str |
ES |
| network_name_str |
Spain |
| spelling |
Luiz Celiberto Jr. and Reinaldo Bianchi acknowledge the support of FAPESP (grants 2012/14010-5 and 2011/19280-8). Paulo E. Santos acknowledges support from FAPESP (grant 2012/04089-3) and CNPq (grant PQ2 -303331/2011-9). Peer Reviewed. |
| dc.title.none.fl_str_mv |
Transferring knowledge as heuristics in reinforcement learning: A case-based approach |
| dc.creator.none.fl_str_mv |
Bianchi, Reinaldo; Celiberto, Luiz A.; Santos, Paulo E.; Matsuura, Jackson P.; López de Mántaras, Ramón |
| dc.contributor.none.fl_str_mv |
Consejo Superior de Investigaciones Científicas [https://ror.org/02gfc7t72] |
| dc.subject.none.fl_str_mv |
Learning process; Humanoid robots; Empirical evaluations; Case-based approach; 2D simulations; Reinforcement learning; Heuristic methods; Meta-algorithms; Case-based reasoning; Anthropomorphic robots; Transfer learning; Target domain |
| description |
The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods that use heuristics, obtained from another (simpler) domain (the source), to accelerate a Reinforcement Learning procedure in one domain (the target). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task in the source domain, storing the knowledge thus obtained in a case base; second, it performs an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a case base learned from a 2D simulation) and stability learning for a humanoid robot in the RoboCup 3D Soccer Simulator (using knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically accelerated reinforcement learning and transfer learning algorithms. © 2015 Elsevier B.V. |
| publishDate |
2015 |
| dc.date.none.fl_str_mv |
2015 2016 2016 2016 |
| dc.type.none.fl_str_mv |
info:eu-repo/semantics/article http://purl.org/coar/resource_type/c_6501 Preprint info:eu-repo/semantics/submittedVersion |
| format |
article |
| status_str |
submittedVersion |
| dc.identifier.none.fl_str_mv |
http://hdl.handle.net/10261/130283 |
| dc.language.none.fl_str_mv |
English |
| dc.relation.none.fl_str_mv |
Yes |
| dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.publisher.none.fl_str_mv |
Elsevier |
| dc.source.none.fl_str_mv |
reponame:DIGITAL.CSIC. Repositorio Institucional del CSIC instname:Consejo Superior de Investigaciones Científicas (CSIC) |