Universität Rostock, 2016
Abstract: The goal of this thesis was to create an architecture that can both localize and recognize objects in a scene while scaling efficiently. Inspired by human perception, the model is designed to selectively focus its attention on different regions of an image and process them sequentially. By combining concepts from supervised and reinforcement learning, a method is developed that enables the architecture to be trained globally. The thesis ends with experiments on digit classification and license plate localization.
Master's thesis, free access