Characterization of anomalous diffusion through convolutional transformers



Bibliographic Details
Authors: Firbas, Nicolás; Garibo i Orts, Óscar; Garcia March, Miguel Angel (ORCID: 0000-0001-7092-838X); Conejero, J. Alberto (ORCID: 0000-0003-3681-7533)
Resource type: article
Publication date: 2023
Country: Spain
Institution: Universitat Politècnica de València (UPV)
Repository: RiuNet. Repositorio Institucional de la Universitat Politècnica de València
Language: English
OAI Identifier: oai:riunet.upv.es:10251/231877
Online access: https://riunet.upv.es/handle/10251/231877
Access level: open access
Keywords: Anomalous diffusion
Machine learning
Recurrent neural networks
Convolutional networks
Transformers
Attention
Description
Summary: [EN] The results of the Anomalous Diffusion Challenge (AnDi Challenge) (Munoz-Gil G et al 2021 Nat. Commun. 12 6253) have shown that machine learning methods can outperform classical statistical methodology at the characterization of anomalous diffusion, both in the inference of the anomalous diffusion exponent alpha associated with each trajectory (Task 1) and in the determination of the underlying diffusive regime that produced such trajectories (Task 2). Furthermore, of the five teams that finished in the top three across both tasks of the AnDi Challenge, three used recurrent neural networks (RNNs). While RNNs, like the long short-term memory network, are effective at learning long-term dependencies in sequential data, their key disadvantage is that they must be trained sequentially. To facilitate training on larger data sets through parallel training, we propose a new transformer-based neural network architecture for the characterization of anomalous diffusion. Our new architecture, the Convolutional Transformer (ConvTransformer), uses a bi-layered convolutional neural network to extract features from our diffusive trajectories; these features can be thought of as words in a sentence. The features are then fed to two transformer encoding blocks that perform either regression (Task 1 1D) or classification (Task 2 1D). To our knowledge, this is the first time transformers have been used for characterizing anomalous diffusion. Moreover, this may be the first time that a transformer encoding block has been used with a convolutional neural network and without the need for a transformer decoding block or positional encoding. Apart from being able to train in parallel, we show that the ConvTransformer is able to outperform the previous state of the art at determining the underlying diffusive regime (Task 2 1D) in short trajectories (length 10-50 steps), which are the most important for experimental researchers.
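The data flow described in the summary (two convolutional layers that turn a trajectory into "word"-like features, followed by two attention-based encoding blocks with no positional encoding and no decoder) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation. The function names, the kernel size of 3, the feature width of 4, the mean pooling, the linear output head and the random untrained weights are all assumptions made for illustration. Each encoding block is also reduced to single-head self-attention with a residual connection, omitting the layer normalization and feed-forward sublayers of a full transformer encoder block.

```python
import math
import random


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d(seq, kernels, bias):
    """'Valid' 1D convolution followed by ReLU.

    seq: list of T vectors with C_in channels each.
    kernels: C_out kernels, each a K x C_in matrix.
    Returns T - K + 1 vectors with C_out channels each.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        vec = []
        for kern, b in zip(kernels, bias):
            s = b + sum(w * x
                        for row_w, row_x in zip(kern, window)
                        for w, x in zip(row_w, row_x))
            vec.append(max(0.0, s))
        out.append(vec)
    return out


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention plus a residual.

    No positional encoding is added: order information reaches the
    tokens only through the convolution windows that produced them.
    """
    q = [matvec(wq, t) for t in tokens]
    k = [matvec(wk, t) for t in tokens]
    v = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(q[0]))
    out = []
    for qi, tok in zip(q, tokens):
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        mixed = [sum(w * vj[d] for w, vj in zip(weights, v))
                 for d in range(len(tok))]
        out.append([x + m for x, m in zip(tok, mixed)])
    return out


def rand_matrix(rng, rows, cols, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def make_params(rng, d_model=4, kernel=3, n_outputs=1):
    """Random, untrained weights (hypothetical sizes)."""
    return {
        "conv1": ([rand_matrix(rng, kernel, 1) for _ in range(d_model)],
                  [0.0] * d_model),
        "conv2": ([rand_matrix(rng, kernel, d_model) for _ in range(d_model)],
                  [0.0] * d_model),
        # Two encoding blocks, each holding (W_q, W_k, W_v).
        "blocks": [tuple(rand_matrix(rng, d_model, d_model) for _ in range(3))
                   for _ in range(2)],
        "head": rand_matrix(rng, n_outputs, d_model),
    }


def forward(trajectory, params):
    """1D trajectory -> conv features -> two attention blocks -> outputs."""
    steps = [[x] for x in trajectory]            # one channel per time step
    tokens = conv1d(conv1d(steps, *params["conv1"]), *params["conv2"])
    for wq, wk, wv in params["blocks"]:
        tokens = self_attention(tokens, wq, wk, wv)
    pooled = [sum(col) / len(tokens) for col in zip(*tokens)]  # mean pooling
    return matvec(params["head"], pooled)


rng = random.Random(0)
walk = [0.0]
for _ in range(19):                              # a 20-point Brownian-like walk
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

# One output unit: regression on the exponent alpha (Task 1).
alpha_pred = forward(walk, make_params(rng, n_outputs=1))

# Several output units plus softmax: classification of the regime (Task 2).
# The class count of 5 is an assumption of this sketch.
class_probs = softmax(forward(walk, make_params(rng, n_outputs=5)))
```

Because the weights are random, `alpha_pred` and `class_probs` carry no physical meaning here. The sketch only shows that every token attends to every other token within a single pass, which is what allows all positions to be processed in parallel during training, in contrast to the step-by-step recurrence of an RNN. A practical version would use a deep-learning library and trained weights.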