3D scene reconstruction and understanding from single shot pictures


Bibliographic Details
Author: García González, Alfredo
Resource type: master's thesis
Publication date: 2012
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2099.1/16427
Online access: https://hdl.handle.net/2099.1/16427
Access level: open access
Keywords: Augmented reality
Computer vision
Machine learning
Three-dimensional imaging
Realitat augmentada
Visió per ordinador
Aprenentatge automàtic
Imatges tridimensionals
Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
Description
Summary: Augmented reality mixes computer-generated graphics with real imagery using computer vision techniques. However, augmented reality is still a very young field of research, and its applications usually rely on predefined tags. This thesis uses computer vision and artificial intelligence techniques to explore the viability of using natural landmarks as key points for computer graphics reference. Many techniques infer 3D scenes from images, such as stereo vision, structure from motion, depth images, and shape from shading. The aim of this work is to find a way of doing this from a single-shot image. Finally, new virtual elements are integrated into the final scene using contextual colors. The methodology is to automatically segment an image into small planar surfaces using different granularities of small regions. Each region is assumed to lie on only one planar surface, so the 3D face it came from can be inferred. The normal vectors of the planes corresponding to the 3D faces are approximated over a discrete set of orientations. In addition, some regions do not have a regular orientation and are therefore treated as textured or porous regions. Inferring the final 3D orientation and location from the set of labelled regions is a non-trivial task. This work proposes a method based on the coherent topology of the neighborhood. The 3D position of each point of a region is found, and a 3D scene can be obtained. After that, the regions of the original image are mapped as textures onto the reconstructed 3D faces. Finally, a color transfer approach is used to integrate new 3D objects into the final scene.
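
The step in which the 3D position of each point of a region is found can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a pinhole camera with a hypothetical focal length `f` and principal point `(cx, cy)`, and it assumes that a region's plane is already known as a normal vector `normal` and offset `d` (points X on the plane satisfy normal . X = d). Under those assumptions, a pixel is back-projected by intersecting its viewing ray with the plane.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def pixel_ray(u: float, v: float, f: float, cx: float, cy: float) -> Vec3:
    """Direction of the viewing ray through pixel (u, v), pinhole model (assumed)."""
    return ((u - cx) / f, (v - cy) / f, 1.0)


def backproject_to_plane(
    u: float, v: float, normal: Vec3, d: float, f: float, cx: float, cy: float
) -> Vec3:
    """Intersect the viewing ray of pixel (u, v) with the plane normal . X = d."""
    ray = pixel_ray(u, v, f, cx, cy)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        # The ray never meets the plane, so no 3D point can be recovered.
        raise ValueError("viewing ray is parallel to the plane")
    t = d / denom  # scale along the ray at which it hits the plane
    return (t * ray[0], t * ray[1], t * ray[2])


# Hypothetical example: a wall facing the camera, 2 units away,
# seen through a 640x480 camera with f = 500 pixels.
point = backproject_to_plane(
    420.0, 240.0, normal=(0.0, 0.0, 1.0), d=2.0, f=500.0, cx=320.0, cy=240.0
)
print(point)
```

Applying the same function to every pixel of a labelled region would give the 3D points of that face; the camera parameters and the plane offset are placeholders here, since the summary does not state how they are obtained.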