Exploring reward strategies for wind turbine pitch control by reinforcement learning

In this work, a pitch controller of a wind turbine (WT) inspired by reinforcement learning (RL) is designed and implemented. The control system consists of a state estimator, a reward strategy, a policy table, and a policy update algorithm. Novel reward strategies related to the energy deviation fro...


Bibliographic Details
Authors: Sierra-García, Jesús Enrique; Santos Peñas, Matilde
Resource type: article
Publication date: 2020
Country: Spain
Institution: Universidad Complutense de Madrid (UCM)
Repository: Docta Complutense
Language: English
OAI Identifier: oai:docta.ucm.es:20.500.14352/112247
Online access: https://hdl.handle.net/20.500.14352/112247
Access level: open access
Keywords: Intelligent control
Pitch control
Wind turbines
Wind energy
Reinforcement learning
Reward strategies
Artificial intelligence (Computer science)
1203.04 Artificial Intelligence
Description: In this work, a pitch controller for a wind turbine (WT) inspired by reinforcement learning (RL) is designed and implemented. The control system consists of a state estimator, a reward strategy, a policy table, and a policy update algorithm. Novel reward strategies related to the energy deviation from the rated power are defined. They are designed to improve the efficiency of the WT. Two new categories of reward strategies are proposed: “only positive” (O-P) and “positive-negative” (P-N) rewards. The relationship of these categories with the exploration-exploitation dilemma, the use of ϵ-greedy methods, and learning convergence is also introduced and linked to the WT control problem. In addition, an extensive analysis of the influence of the different rewards on controller performance and learning speed is carried out. The controller is compared with a proportional-integral-derivative (PID) regulator for the same small wind turbine and obtains better results. The simulations show how the P-N rewards improve the performance of the controller, stabilize the output power around the rated power, and reduce the error over time.
Publisher: MDPI
Format: application/pdf
License: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
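
Illustrative sketch (not taken from the article): the description names four components, a state estimator, a reward strategy, a policy table, and a policy update algorithm, together with ϵ-greedy exploration. The Python below shows one way such pieces could fit together. Every name, the pitch increments, the reward values, the binning, and the update rule are assumptions made for illustration; the record does not give the authors' actual formulas.

```python
import random

# Hypothetical pitch increments in degrees; not values from the article.
ACTIONS = (-1.0, 0.0, 1.0)


def only_positive_reward(prev_error, error):
    """O-P style (assumed form): reward improvement, never penalize."""
    return 1.0 if abs(error) < abs(prev_error) else 0.0


def positive_negative_reward(prev_error, error):
    """P-N style (assumed form): reward improvement, penalize deterioration."""
    if abs(error) < abs(prev_error):
        return 1.0
    if abs(error) > abs(prev_error):
        return -1.0
    return 0.0


def estimate_state(error, n_bins=5, max_error=1.0):
    """Map a normalized power error in [-max_error, max_error] to a bin index."""
    clipped = max(-max_error, min(max_error, error))
    fraction = (clipped + max_error) / (2.0 * max_error)
    return min(n_bins - 1, int(fraction * n_bins))


def select_action(table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take the best entry."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = table[state]
    return max(range(len(ACTIONS)), key=lambda a: row[a])


def update_policy(table, state, action, reward, learning_rate=0.1):
    """Move the table entry toward the received reward (assumed update rule)."""
    table[state][action] += learning_rate * (reward - table[state][action])
```

In this sketch, an O-P table entry never falls below its initial value of zero, so a poor action is never ranked below an untried one and a nonzero ϵ is what makes the agent try alternatives. A P-N entry can become negative, which pushes the greedy choice away from actions that increased the error.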