Study of a Lifelong Learning Scenario for Question Answering

Bibliographic Details
Authors: Echegoyen, Guillermo; Rodrigo Yuste, Álvaro; Peñas Padilla, Anselmo
Format: article
Publication Date: 2022
Country: Spain
Institution: Universidad Nacional de Educación a Distancia
Repository: e-spacio. Repositorio Institucional de la UNED
Language: English
OAI Identifier: oai:e-spacio.uned.es:20.500.14468/23829
Online Access: https://hdl.handle.net/20.500.14468/23829
Access Level: Open access
Keyword: 12 Mathematics::1203 Computer Science::1203.17 Informatics
Question Answering
Lifelong Learning
Transfer Learning
Deep Learning
Description
Summary: Question Answering (QA) systems have advanced significantly in recent years thanks to neural architectures that employ large pre-trained models such as BERT. However, once a QA model is fine-tuned for a task (e.g., a particular type of question over a particular domain), system performance drops when new tasks are added over time (e.g., new types of questions or new domains). The system therefore requires retraining but, since the data distribution has shifted away from what was previously learned, performance on previous tasks drops significantly. Hence, we need strategies to make our systems robust to the passage of time. Lifelong Learning (LL) studies how systems can take advantage of previous learning and acquired knowledge to maintain or improve performance over time. In this article, we explore a scenario in which the same LL-based QA system undergoes several shifts in the data distribution over time, represented by the addition of new, different QA datasets. In this setup, the following research questions arise: (i) How can LL-based QA systems benefit from previously learned tasks? (ii) Is there a strategy general enough to maintain or improve performance over time when new tasks are added? and, finally, (iii) How can we detect a lack of knowledge that prevents questions from being answered and must trigger a new learning process? To answer these questions, we systematically try all possible training sequences over three well-known QA datasets. Our results show that learning a new dataset is sensitive to previous training sequences, and that we can find a strategy general enough to avoid the combinatorial explosion of testing all possible training sequences. Thus, when a new dataset is added to the system, the best way to retrain the system without dropping performance on the previous datasets is to randomly merge the new training material with the previous one.
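
The two ideas in the summary, enumerating every training order over three datasets and randomly merging new training material with the old, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the dataset names, the helper functions `training_sequences` and `merge_for_retraining`, and the fixed seed are all hypothetical placeholders chosen for illustration, and the record does not say which three datasets were used.

```python
import itertools
import random


def training_sequences(dataset_names):
    """Return every order in which the datasets could be learned."""
    return list(itertools.permutations(dataset_names))


def merge_for_retraining(previous_examples, new_examples, seed=0):
    """Combine old and new training examples and shuffle them together.

    The seed is only here to make the illustration reproducible.
    """
    merged = list(previous_examples) + list(new_examples)
    random.Random(seed).shuffle(merged)
    return merged


# Hypothetical placeholder names; the record does not name the datasets.
datasets = ["dataset_a", "dataset_b", "dataset_c"]
sequences = training_sequences(datasets)
print(len(sequences))  # 3! = 6 possible training orders

# Toy examples standing in for (dataset, example_id) training items.
old = [("dataset_a", i) for i in range(3)]
new = [("dataset_b", i) for i in range(2)]
merged = merge_for_retraining(old, new, seed=13)
print(len(merged))  # 5 items: every old and new example appears once
```

With three datasets the enumeration is small, but the number of orders grows factorially with the number of datasets, which is the combinatorial explosion the summary refers to; the random merge sidesteps it by not depending on any particular order.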