Bio-inspired event-driven intelligence for motion estimation


Bibliographic Details
Author: Tian, Yi
Resource type: doctoral thesis
Publication date: 2025
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/430026
Online access: https://hdl.handle.net/2117/430026
https://dx.doi.org/10.5821/dissertation-2117-430026
Access level: open access
Keywords: Event camera
Motion estimation
Spiking Neural Networks (SNNs)
Ego-motion
Optical flow
004 - Computer science
UPC subject areas::Computer science
Description
Summary: Motion estimation problems range from low degree-of-freedom (DOF) ego-motion estimation to complex, high-DOF motion such as dense per-pixel displacement, or optical flow. This information is essential for enabling robots to perceive and navigate their environments. However, existing vision systems for motion estimation are less robust and less efficient than biological systems, largely because of limitations in sensor technology and processing methods. This thesis builds on a bio-inspired sensor, the event camera, and a brain-inspired computing approach, Spiking Neural Networks (SNNs), to present a promising solution that bridges these gaps.

Event cameras offer high temporal resolution, low latency, reduced data redundancy, and power efficiency. These capabilities make them particularly well suited to environments and tasks where traditional frame-based cameras struggle. They show great potential for motion estimation across a wide range of applications, such as accurate, low-latency motion estimation for autonomous vehicles or aerial robots. SNNs are inspired by how neurons in the human brain communicate through synapses using spikes: brief, discrete electrical signals that allow highly efficient and robust information processing.

The thesis begins with 3-DOF ego-motion estimation, progresses to sparse optical flow, and ultimately tackles dense optical flow. In the first step, it addresses event-based ego-motion estimation by integrating SNN approaches with traditional optimization-based techniques. It explores ego-motion estimation from optical flow inferred by an SNN and proposes a pooling method to address the aperture problem encountered in the sparse and noisy normal-flow output of the SNN. In the next step, modern artificial neural network (ANN) architectures are leveraged to improve event-based optical flow estimation; this step proposes a U-Net transformer-based architecture with a recurrent neural network as the backbone. In the final phase, the vision transformer architecture is further extended to flow encoders, incorporating spatiotemporal attention to enhance the extraction of temporal information. This led to the development of a Swin transformer-based ANN model and its spiking counterpart. Notably, this work marks the first use of spikeformers in event-based optical flow estimation, demonstrating the potential of combining transformer architectures with SNNs for regression tasks.

Overall, this thesis advances the understanding of motion estimation using event cameras. It sets the stage for their application in real-world scenarios such as high-speed object tracking and simultaneous localization and mapping (SLAM). The biologically inspired methods developed in this thesis offer promising avenues for balancing performance and efficiency in computer vision and robotics systems, paving the way for future innovations in this field.
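
The summary describes spiking neurons as communicating through brief, discrete signals. As a rough illustration only, the sketch below simulates a single leaky integrate-and-fire (LIF) neuron, a common textbook spiking model. This record does not state which neuron model the thesis uses, so the model choice, the function name `lif_spikes`, and all parameter values are assumptions made for this example.

```python
def lif_spikes(inputs, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron (illustrative sketch).

    inputs   : sequence of input currents, one per time step
    tau      : membrane time constant (assumed value)
    v_thresh : firing threshold (assumed value)
    v_reset  : membrane potential after a spike (assumed value)
    dt       : time step used for forward-Euler integration
    Returns a list of 0/1 flags marking the steps where the neuron fired.
    """
    v = 0.0
    spikes = []
    for current in inputs:
        # The membrane potential leaks toward zero and integrates the input.
        v += dt * (-v / tau + current)
        if v >= v_thresh:
            # Threshold crossed: emit a discrete spike and reset.
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    # A constant input yields a regular spike train; no input yields silence.
    print(lif_spikes([0.3] * 10))
    print(lif_spikes([0.0] * 10))
```

The output is a sparse binary sequence rather than a continuous activation, which is the property that the summary associates with efficient processing in SNNs.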