Weighted contrastive divergence


Bibliographic Details
Authors: Romero Merino, Enrique (ORCID 0000-0003-2404-5716); Mazzanti Castrillejo, Fernando Pablo (ORCID 0000-0001-6641-0609); Delgado Pin, Jordi (ORCID 0000-0003-4546-8355); Buchaca Prats, David
Resource type: article
Publication date: 2019
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier:oai:upcommons.upc.edu:2117/133368
Online access: https://hdl.handle.net/2117/133368
https://dx.doi.org/10.1016/j.neunet.2018.09.013
Access Level: open access
Keywords: Neural networks (Computer science)
Machine learning
Restricted Boltzmann machine
Contrastive divergence
Xarxes neuronals (Informàtica)
Aprenentatge automàtic
Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial
Description
Summary: Learning algorithms for energy-based Boltzmann architectures that rely on gradient descent are in general computationally prohibitive, typically due to the exponential number of terms involved in computing the partition function. One therefore has to resort to approximation schemes for the evaluation of the gradient. This is the case for Restricted Boltzmann Machines (RBMs) and their learning algorithm, Contrastive Divergence (CD). It is well known that CD has a number of shortcomings and that its approximation to the gradient has several drawbacks. Overcoming these defects has been the basis of much research, and new algorithms have been devised, such as persistent CD. In this manuscript we propose a new algorithm that we call Weighted CD (WCD), built from small modifications of the negative phase in standard CD. However small these modifications may be, experimental work reported in this paper suggests that WCD provides a significant improvement over standard CD and persistent CD at a small additional computational cost.
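
For orientation, the following is a minimal sketch of standard one-step Contrastive Divergence (CD-1) for a binary RBM, the baseline the summary refers to. It is not the WCD algorithm: this record does not state how WCD changes the negative phase, so that modification is not reproduced here. All names (`cd1_gradient`, `cd1_step`, `W`, `b`, `c`, `lr`) are hypothetical and chosen for illustration, and the code uses only the Python standard library.

```python
import math
import random


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for a binary RBM; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for a binary RBM."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    """Draw independent Bernoulli samples from a list of probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_gradient(batch, W, b, c, rng):
    """Estimate the CD-1 gradient (positive phase minus negative phase)."""
    n_v, n_h, n = len(b), len(c), len(batch)
    dW = [[0.0] * n_h for _ in range(n_v)]
    db = [0.0] * n_v
    dc = [0.0] * n_h
    for v0 in batch:
        ph0 = hidden_probs(v0, W, c)               # positive phase: driven by data
        h0 = sample(ph0, rng)
        v1 = sample(visible_probs(h0, W, b), rng)  # one Gibbs step
        ph1 = hidden_probs(v1, W, c)               # negative phase: reconstruction
        for i in range(n_v):
            db[i] += (v0[i] - v1[i]) / n           # uniform 1/n weight per sample
            for j in range(n_h):
                dW[i][j] += (v0[i] * ph0[j] - v1[i] * ph1[j]) / n
        for j in range(n_h):
            dc[j] += (ph0[j] - ph1[j]) / n
    return dW, db, dc


def cd1_step(batch, W, b, c, rng, lr=0.1):
    """Apply one gradient-ascent update in place using the CD-1 estimate."""
    dW, db, dc = cd1_gradient(batch, W, b, c, rng)
    for i in range(len(b)):
        b[i] += lr * db[i]
        for j in range(len(c)):
            W[i][j] += lr * dW[i][j]
    for j in range(len(c)):
        c[j] += lr * dc[j]
```

The terms built from `v1` and `ph1` form the negative phase, which is the part the summary says WCD modifies. A typical call would pass a list of binary vectors as `batch` and a seeded `random.Random` instance as `rng` so that runs are repeatable.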