Classification of news categories using BERT

Bibliographic Details
Authors: Machado Medina, Bradlhy Luis; Santillana Quirita, César Alonso; Bautista Luque, Sharmelyn Violeta
Resource type: article
Status: Published version
Publication date: 2023
Country: Peru
Institution: Universidad La Salle
Repository: Revistas - Universidad La Salle
Language: Spanish
OAI Identifier: oai:ojs.revistas.ulasalle.edu.pe:article/98
Online access: https://revistas.ulasalle.edu.pe/innosoft/article/view/98
https://doi.org/10.48168/innosoft.s12.a98
https://purl.org/42411/s12/a98
https://n2t.net/ark:/42411/s12/a98
Access level: open access
Keywords: News classification
natural language processing
BERT
machine learning
artificial intelligence
clasificacion de noticias
procesamiento de lenguaje natural
inteligencia artificial
Description
Summary: This project develops a Natural Language Processing model that classifies news articles using previously evaluated datasets. The main objective is to create a system that automatically assigns each news item to one of five predefined categories: business, entertainment, politics, sports, or technology. The work involves data preprocessing, feature extraction, training a machine learning model, and evaluating its performance with metrics such as accuracy, recall, and F1-score. These metrics indicate how well the model predicts the correct category for a new or unlabeled news item. If the model performs satisfactorily, it can be used to classify unlabeled news in real time. In summary, the project seeks to provide an efficient and accurate way to organize and label news content with the help of Artificial Intelligence.
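
The evaluation step named in the summary can be illustrated with a minimal sketch. The record does not say how the authors computed or averaged their metrics, so the macro averaging, the function name `evaluate`, and the toy labels below are assumptions made only for illustration. The sketch uses the Python standard library and the five categories listed in the summary.

```python
from collections import Counter

# The five categories named in the summary.
CATEGORIES = ["business", "entertainment", "politics", "sports", "technology"]


def evaluate(y_true, y_pred, labels=CATEGORIES):
    """Return accuracy, macro-averaged recall, and macro-averaged F1.

    Hypothetical helper: macro averaging is an assumption, not a detail
    taken from the article.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fn[truth] += 1   # the true class was missed
            fp[pred] += 1    # the predicted class was wrongly assigned

    recalls, f1_scores = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_recall": sum(recalls) / len(labels),
        "macro_f1": sum(f1_scores) / len(labels),
    }


# Toy labels, invented for illustration only (not results from the article).
y_true = ["business", "sports", "politics", "sports", "technology", "entertainment"]
y_pred = ["business", "sports", "business", "sports", "technology", "politics"]
scores = evaluate(y_true, y_pred)
print(scores)
```

Macro averaging weights every category equally, which can be informative when some categories have fewer articles than others. A weighted or micro average would be an equally plausible choice.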