Bioactivity descriptors for uncharacterized chemical compounds

Bibliographic Details
Authors: Bertoni, Martino; Duran-Frigola, Miquel, 1985-; Badia-I-Mompel, Pau; Pauls, Eduardo; Orozco, Modesto; Guitart Pla, Oriol; Alcalde, Víctor; Diaz, Víctor M.; Berenguer-Llergo, Antonio; Brun-Heath, Isabelle; Villegas, Núria; García de Herreros, Antonio; Aloy, Patrick, 1972-
Format: article
Status: Published version
Publication Date: 2021
Country: Spain
Institution: Various* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Repository: Recercat. Dipósit de la Recerca de Catalunya
OAI Identifier: oai:recercat.cat:10230/48319
Online Access: http://hdl.handle.net/10230/48319; http://dx.doi.org/10.1038/s41467-021-24150-4
Access Level: Open access
Keywords: Cheminformatics; Machine learning; Networks and systems biology
Acknowledgements: We would like to thank the SB&NB lab members for their support and helpful discussions. We are grateful to T.O. Botelho, I. Ramos, and C. Gonzalez for giving us access to the IRB Barcelona and Prestwick libraries. P.A. acknowledges the support of the Generalitat de Catalunya (RIS3CAT Emergents CECH: 001-P-001682 and VEIS: 001-P-001647), the Spanish Ministerio de Economía y Competitividad (BIO2016-77038-R), the European Research Council (SysPharmAD: 614944), and the European Commission (RiPCoN: 101003633). A.G.d.H. acknowledges support by Agencia Estatal de Investigación (AEI) and Fondos FEDER (PID2019-104698RB-I00).
Rights: © The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Abstract: Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for them. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.
Language: English
Publisher: Nature Research
Citation: Nat Commun. 2021;12(1):3932
Grant agreements: info:eu-repo/grantAgreement/EC/FP7/614944; info:eu-repo/grantAgreement/ES/1PE/BIO2016-77038-R
File format: application/pdf