Analysis of taxonomic profile classification techniques to predict inflammatory bowel disease disorders

Bibliographic Details
Author: Castillo Rosa, Eva
Resource type: master's thesis
Publication date: 2020
Country: Spain
Institution: Universitat Oberta de Catalunya (UOC)
Repository: O2, institutional repository of the UOC
OAI Identifier: oai:openaccess.uoc.edu:10609/121346
Online access: http://hdl.handle.net/10609/121346
Access level: open access
Keywords: microbiota
machine learning
shiny
Bioinformatics -- TFM
Description
Summary: Inflammatory bowel disease comprises a wide range of disorders with similar symptoms, so studying the bacteria present in patients' microbiota is key to diagnosing and treating these diseases. A thorough comparison of the available classification algorithms is crucial to find the optimal ones and apply them to biomarker discovery or, ultimately, clinical diagnosis. In this study, the microbial diversity of biopsy samples from healthy individuals and from patients with Crohn's disease or ulcerative colitis was analysed with QIIME 2 software. Various supervised machine learning methods were applied to classify samples from bacterial relative abundance data. Finally, an interactive web application was developed to adapt the optimal models to the user's input data. Although some linear models perform similarly to more complex ones, the best-performing model is random forest. In addition, choosing a good dimensionality reduction method is important when applying machine learning to microbiome data. It is just as crucial to make these analyses available to the entire scientific community so that large-scale studies can be carried out.
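The relative abundance data mentioned in the summary can be illustrated with a minimal sketch. It assumes a hypothetical table of raw per-taxon read counts for a single sample; the function name, taxa, and counts are invented for illustration and are not taken from the thesis, whose own analysis was carried out with QIIME 2.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw per-taxon counts for one sample into proportions summing to 1."""
    if any(n < 0 for n in counts.values()):
        raise ValueError("counts must be non-negative")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    # Divide each taxon's count by the sample total (total-sum scaling).
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical sample: these counts are invented, not data from the thesis.
sample = {"Bacteroides": 50, "Faecalibacterium": 30, "Escherichia": 20}
profile = relative_abundance(sample)
```

Profiles of this kind, one per sample, are the sort of input a supervised classifier would be trained on.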