@MASTERSTHESIS{IMM2012-06376, author = "M. D. Olsen", title = "Stereo and Robot Navigation", year = "2012", school = "Technical University of Denmark, {DTU} Informatics, {E-}mail: reception@imm.dtu.dk", address = "Asmussens Alle, Building 305, {DK-}2800 Kgs. Lyngby, Denmark", type = "", note = "{DTU} supervisors: Henrik Aan{\ae}s, haa@imm.dtu.dk, and Anders Lindbjerg Dahl, abd@imm.dtu.dk, {DTU} Informatics. Thesis not publicly available.", url = "http://www.imm.dtu.dk/English.aspx", abstract = "Structure from Motion is not a newly discovered technique for acquiring information about the world seen by a camera. Many people have looked into estimating the motion of a camera while reconstructing the unknown environment the camera sees. This is done, e.g., by the company {CLAAS} Agrosystems, where computer vision algorithms are used to extract information from images for robot guidance, quality control and object localization. In this thesis, the concept of Structure from Motion is studied, with a focus on applying it in an agricultural environment. However, the theory can easily be applied to other environments as well, as the input data are simple images acquired with a camera. The differences between monocular and stereo-based Structure from Motion will be explained and tested on a number of different data sets. Theory concerning disparity space, bundle adjustment and {RANSAC} will be described and used in order to implement a full Structure from Motion setup. The final result will be an estimate of the motion of the camera within a sparsely reconstructed {3D} world. In order to increase the detail of the reconstructed point cloud, dense matching techniques will be considered, where the goal is to do dense matching across multiple images. This will include both dense matching based on image rectification and disparity estimation, and another, less constrained approach known as optical flow. 
The dense matching will be used for tracking objects/points over multiple frames, which in the end will result in a dense {3D} reconstruction, where objects such as a bale of straw can be localized with better accuracy than with the sparse {3D} reconstruction. Furthermore, the goal of combining cameras in a good manner is considered, as this should be used for feature tracking over multiple frames." }