Generalized Stacked Sequential Learning

Puertas i Prats, Eloi

Over the past few decades, machine learning (ML) algorithms have become a very useful tool in tasks where designing and programming explicit, rule-based algorithms are infeasible. Some examples of applications where machine learning has been applied successfully are spam filtering, optical character...


Bibliographic Details
Author:	Puertas i Prats, Eloi
Resource type:	doctoral thesis
Status:	Published version
Publication date:	2014
Country:	Spain
Institution:	CBUC, CESCA
Repository:	TDR. Tesis Doctorales en Red
OAI Identifier:	oai:www.tdx.cat:10803/285969
Online access:	http://hdl.handle.net/10803/285969
Access Level:	open access
Keywords:	Intel·ligència artificial; Inteligencia artificial; Artificial intelligence; Aprenentatge automàtic; Aprendizaje automático; Machine learning; Visió per ordinador; Visión por ordenador; Computer vision; Reconeixement de formes (Informàtica); Reconocimiento de formas (Informática); Pattern recognition systems; Ciències Experimentals i Matemàtiques; 51

id	ES_9d4776bbc4f01aee14e2180ece3bc830
oai_identifier_str	oai:www.tdx.cat:10803/285969
network_acronym_str	ES
network_name_str	España
repository_id_str
dc.title.none.fl_str_mv	Generalized Stacked Sequential Learning
title	Generalized Stacked Sequential Learning
spellingShingle	Generalized Stacked Sequential Learning Puertas i Prats, Eloi Intel·ligència artificial Inteligencia artificial Artificial intelligence Aprenentatge automàtic Aprendizaje automático Machine learning Visió per ordinador Visión por ordenador Computer vision Reconeixement de formes (Informàtica) Reconocimiento de formas (Informática) Pattern recognition systems Ciències Experimentals i Matemàtiques 51
title_short	Generalized Stacked Sequential Learning
title_full	Generalized Stacked Sequential Learning
title_fullStr	Generalized Stacked Sequential Learning
title_full_unstemmed	Generalized Stacked Sequential Learning
title_sort	Generalized Stacked Sequential Learning
dc.creator.none.fl_str_mv	Puertas i Prats, Eloi
author	Puertas i Prats, Eloi
author_facet	Puertas i Prats, Eloi
author_role	author
dc.contributor.none.fl_str_mv	Pujol Vila, Oriol; Escalera Guerrero, Sergio; Universitat de Barcelona. Departament de Matemàtica Aplicada i Anàlisi
dc.subject.none.fl_str_mv	Intel·ligència artificial; Inteligencia artificial; Artificial intelligence; Aprenentatge automàtic; Aprendizaje automático; Machine learning; Visió per ordinador; Visión por ordenador; Computer vision; Reconeixement de formes (Informàtica); Reconocimiento de formas (Informática); Pattern recognition systems; Ciències Experimentals i Matemàtiques; 51
topic	Intel·ligència artificial; Inteligencia artificial; Artificial intelligence; Aprenentatge automàtic; Aprendizaje automático; Machine learning; Visió per ordinador; Visión por ordenador; Computer vision; Reconeixement de formes (Informàtica); Reconocimiento de formas (Informática); Pattern recognition systems; Ciències Experimentals i Matemàtiques; 51
description	Over the past few decades, machine learning (ML) algorithms have become a very useful tool for tasks where designing and programming explicit, rule-based algorithms are infeasible. Examples of successful applications include spam filtering, optical character recognition (OCR), search engines and computer vision. One of the most common tasks in ML is supervised learning, where the goal is to learn, from a set of known labeled input data, a general model able to predict the correct label of unseen examples. In supervised learning it is often assumed that the data are independent and identically distributed (i.i.d.). This means that each sample in the data set has the same probability distribution as the others and all are mutually independent.
However, classification problems in real-world databases can break this i.i.d. assumption. For example, consider object recognition in image understanding: if one pixel belongs to a certain object category, it is very likely that neighboring pixels also belong to the same object, except at the borders. Another example is laughter detection in voice recordings. A laugh has a clear pattern of alternating voice and non-voice segments. Thus, discriminant information comes from the alternating pattern, and not just from the samples on their own. A further example is signature section recognition in e-mail: the signature is usually found at the end of the message, so important discriminant information is found in the context. Another case is part-of-speech tagging, in which each example describes a word that is categorized as noun, verb, adjective, etc. Here it is very unlikely that patterns such as [verb, verb, adjective, verb] occur. All these applications share a common feature: the sequence or context of the labels matters.
Sequential learning (25) breaks the i.i.d. assumption and assumes that samples are not independently drawn from a joint distribution of the data samples X and their labels Y. In sequential learning the training data consist of sequences of pairs (x, y), so that neighboring examples exhibit some kind of correlation. Sequential learning applications usually consider relationships with one-dimensional support, but these types of relationships also appear very frequently in other domains, such as images or video. Sequential learning should not be confused with time series prediction. The main difference between the two problems lies in the fact that sequential learning has access to the whole data set before any prediction is made, and the full set of labels is to be provided at the same time. Time series prediction, on the other hand, has access to the real labels up to the current time t, and the goal is to predict the label at t + 1. Another related but different problem is sequence classification, where the task is to predict a single label for an input sequence. In the image domain, the goal of sequential learning is to classify the pixels of the image taking into account their context, while sequence classification is equivalent to classifying one full image as one class.
Sequential learning has been addressed from different perspectives: from the point of view of meta-learning, by means of sliding window techniques, recurrent sliding windows or stacked sequential learning, where the method is formulated as a combination of classifiers; or from the point of view of graphical models, using for example Hidden Markov Models (HMM) or Conditional Random Fields (CRF). In this thesis, we are concerned with meta-learning strategies. Cohen et al. (17) showed that stacked sequential learning (SSL from now on) performed better than CRF and HMM on a subset of problems called “sequential partitioning problems”. These problems are characterized by long runs of identical labels. Moreover, SSL is computationally very efficient since it only needs to train two classifiers a constant number of times. Considering these benefits, we decided to explore sequential learning in depth using SSL and to generalize the Cohen architecture to deal with a wider variety of problems.
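The two-classifier scheme mentioned in the description can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the usual two-stage formulation, in which a base classifier produces cross-validated predictions, each sample is extended with a window of its neighbors' predictions, and a second classifier is trained on the extended samples. The names `NearestCentroid`, `extend`, `cross_val_predictions`, `fit_ssl` and `predict_ssl`, the window half-width `w`, the fold count `k`, the edge-clamping rule and the toy data are all hypothetical choices made for this example.

```python
"""Minimal two-stage stacked sequential learning (SSL) sketch, standard library only."""


class NearestCentroid:
    """Toy base learner (an assumption of this sketch): predict the closest class mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Column-wise mean of all rows carrying this label.
            self.centroids[label] = [sum(col) / len(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]


def extend(X, preds, w):
    """Append to each sample the predictions of its 2*w + 1 neighbors.

    Positions outside the sequence reuse the nearest edge prediction.
    """
    n = len(X)
    return [list(X[t]) + [preds[min(max(i, 0), n - 1)] for i in range(t - w, t + w + 1)]
            for t in range(n)]


def cross_val_predictions(X, y, k):
    """Predict every sample with a base learner that did not see it (k interleaved folds)."""
    n = len(X)
    preds = [None] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        clf = NearestCentroid().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, clf.predict([X[i] for i in test])):
            preds[i] = p
    return preds


def fit_ssl(X, y, w=2, k=5):
    """Train the base classifier h1 and the stacked classifier h2."""
    preds = cross_val_predictions(X, y, k)
    h1 = NearestCentroid().fit(X, y)
    h2 = NearestCentroid().fit(extend(X, preds, w), y)
    return h1, h2


def predict_ssl(h1, h2, X, w=2):
    """Label a whole sequence: base predictions first, then the stacked classifier."""
    return h2.predict(extend(X, h1.predict(X), w))


# Toy "sequential partitioning" data: two long runs of labels, one noisy sample in each run.
y_demo = [0] * 10 + [1] * 10
X_demo = [[float(label)] for label in y_demo]
X_demo[4] = [0.9]   # looks like class 1 but sits inside the run of 0s
X_demo[14] = [0.1]  # looks like class 0 but sits inside the run of 1s

h1_demo, h2_demo = fit_ssl(X_demo, y_demo)
print("base   :", h1_demo.predict(X_demo))
print("stacked:", predict_ssl(h1_demo, h2_demo, X_demo))
```

On this toy sequence the base learner mislabels the two isolated noisy samples, while the stacked classifier recovers the two runs because the predictions of neighboring samples outweigh the single misleading feature. The base learner is fitted k + 1 times and the stacked one once, which matches the "constant number of times" noted in the description.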
publishDate	2014
dc.date.none.fl_str_mv	2014 2015 2015
dc.type.none.fl_str_mv	info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/publishedVersion
format	doctoralThesis
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/10803/285969
url	http://hdl.handle.net/10803/285969
dc.language.none.fl_str_mv	Inglés
language_invalid_str_mv	Inglés
dc.rights.none.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/es/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/es/
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	118 p. application/pdf application/pdf
dc.publisher.none.fl_str_mv	Universitat de Barcelona
publisher.none.fl_str_mv	Universitat de Barcelona
dc.source.none.fl_str_mv	TDX (Tesis Doctorals en Xarxa) reponame:TDR. Tesis Doctorales en Red instname:CBUC, CESCA
instname_str	CBUC, CESCA
reponame_str	TDR. Tesis Doctorales en Red
collection	TDR. Tesis Doctorales en Red
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_	1869414729409101824
spelling	Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ info:eu-repo/semantics/openAccess oai:www.tdx.cat:10803/285969 2026-06-14T12:46:07Z
