
In: Binocular Vision

ISBN: 978-1-60876-547-8

Editors: J. McCoun et al., pp. 1-62

© 2010 Nova Science Publishers, Inc.

Chapter 1

NEW TRENDS IN SURFACE RECONSTRUCTION

USING SPACE-TIME CAMERAS:

FUSING STRUCTURE FROM MOTION,

SILHOUETTE, AND STEREO

Hossein Ebrahimnezhad1,a and Hassan Ghassemian2,b

1 Sahand University of Technology, Dept. of Electrical Engineering, Computer Vision Research Lab, Tabriz, Iran

2 Tarbiat Modaress University, Dept. of Electrical and Computer Engineering, Tehran, Iran

Abstract

Three-dimensional model reconstruction from image sequences has been used extensively in recent years. The most popular method, known as structure from motion, employs feature and dense point matching to compute motion and depth. This chapter presents an overview of new trends in three-dimensional model reconstruction from multiple views of an object, developed by the authors [43]. A robust curve matching method in stereo cameras for the extraction of unique space curves is explained. Unique space curves are constructed from plane curves in stereo images based on curvature and torsion consistency. The shortcoming of outliers in motion estimation is greatly reduced by employing the space curves. Besides, the curve matching method deals with pixel-range information and does not require sub-pixel accuracy to compute structure and motion. Furthermore, it finds correspondences based on curve shape and does not use any photometric information. This property makes the matching process very robust against color and intensity maladjustment of the stereo rigs. The recovered space curves are employed to estimate robust motion by minimizing the curve distance in the next sequence of stereo images. An efficient structure of stereo rigs, perpendicular double stereo, is presented to increase the accuracy of motion estimation. Using the robust motion information, a set of exactly calibrated virtual cameras is constructed, which we call space-time cameras. Then, the visual hull of the object is extracted from the intersection of the silhouette cones of all virtual cameras. Finally, color information is mapped onto the reconstructed surface by inverse projection from the two-dimensional image sets to three-dimensional space. Altogether, we introduce a complete, automatic, and practical system of three-dimensional model reconstruction, from raw images of an arbitrarily moving object captured by fixed calibrated perpendicular double stereo rigs to surface representation. While simple motion estimation methods suffer from statistical bias due to quantization noise, measurement error, and outliers in the input data set, the proposed system overcomes the bias problem by fusing several constraints, even at the pixel level. Experimental results demonstrate the superior performance of the system for a variety of object shapes and textures.

a E-mail address: ebrahimnezhad@sut.ac.ir

b E-mail address: ghaasemi@modares.ac.ir

Web address: http://ee.sut.ac.ir/showcvdetail.aspx?id=5

Keywords: 3D model reconstruction; space-time cameras; perpendicular double stereo; structure from silhouette; structure from motion; space curves; unique points; visual hull.

1. Introduction

Reconstruction of a surface model of a moving rigid object from a sequence of images is a challenging problem and an active research topic in computer vision. In recent years, there has been extensive work in the literature on recovering three-dimensional structure and motion from image sequences [1-6]. Different types of algorithms are used because of the wide range of options, e.g., the image projection model, the number of cameras and available views, the availability of camera calibration, the feature types, and the model of the scene. For a fixed object with a moving camera (or a moving rigid object with a fixed camera), the shape and motion recovery problem can be formulated as finding the six motion parameters of the object, i.e., its position and orientation displacement, together with accurate 3D world coordinates for each point. This problem is also known as bundle adjustment [7]. The standard method of rigid motion recovery has been developed over the last decade based on sparse feature points [8-9]. The sparse method typically assumes that correspondences between scene features such as corners or surface creases have been established by a tracking technique. It can compute only the traveling camera positions and is not sufficient for modeling the object, as it reconstructs only sparsely distributed 3D points. Typically, motion estimation methods suffer from instability due to quantization noise, measurement errors, and outliers in the input data sets. Outliers occur in the feature-matching process mostly due to occlusions. Different robust estimation techniques have been proposed to handle outliers. RANdom SAmple Consensus (RANSAC) is known as a successful technique for dealing with outliers [10]. M-estimators reduce the effects of outliers by applying weighted least squares [11]. Many similar methods are also available [12-13]. Another standard method of shape recovery from motion has been developed over the last decade based on optical flow [1, 2, and 14].
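To illustrate the consensus idea behind RANSAC mentioned above, the following minimal sketch fits a 2D line to points contaminated with outliers. The function name and parameters are our own illustrative choices, not from the cited work [10]; real motion estimators apply the same scheme to a motion model rather than a line.

```python
import random
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.1, seed=0):
    """Minimal RANSAC: fit y = a*x + b to 2D points with outliers.

    Repeatedly fits the model to a random minimal sample (2 points)
    and keeps the model with the largest consensus (inlier) set.
    """
    rng = random.Random(seed)
    best_inliers = []
    pts = list(points)
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(pts, 2)
        if x1 == x2:
            continue  # degenerate minimal sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in pts if abs(y - (a * x + b)) < inlier_tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine with ordinary least squares on the consensus set only,
    # so the outliers no longer bias the final estimate.
    xs = np.array([p[0] for p in best_inliers])
    ys = np.array([p[1] for p in best_inliers])
    A = np.stack([xs, np.ones_like(xs)], axis=1)
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef  # (a, b)
```

The key design point is that the final least-squares refit sees only the consensus set, which is what suppresses the outlier bias discussed in the text.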

The process of structure and motion recovery usually consists of minimizing some cost function, and there are two dominant approaches to choosing it. The first approach is based on epipolar geometry, leading to a decoupling of shape and motion recovery. In the epipolar-constraint approach, the cost function reflects the amount of deviation from the epipolar constraint caused by noise and other measurement errors. In this method, the motion information can be obtained as the solution to a linear problem. The presence of statistical bias in estimating the translation [15-17], as well as the sensitivity to noise and pixel quantization, is the conventional drawback, which introduces additional error into the linear solution. Even small pixel-level perturbations can render the image-plane information ineffective and cause incorrect motion recovery. To improve the solution, some methods minimize the cost function using nonlinear iterative methods such as the Levenberg-Marquardt algorithm [8]; such methods are initialized with the output of the linear algorithms. The second approach directly minimizes the difference between observed and predicted feature coordinates using the Levenberg-Marquardt algorithm [18]. This method is marked by a high-dimensional search space (typically n+6 for n image correspondences) and, unlike the epipolar-constraint-based approach, it does not explicitly account for the fact that a one-parameter family of solutions exists.
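The epipolar-constraint cost above can be sketched concretely: for normalized image points x1, x2 and relative motion (R, t), the algebraic residual is x2^T E x1 with the essential matrix E = [t]_x R, and a cost function sums its squares over all correspondences. The helper names below are illustrative, not from the chapter.

```python
import numpy as np

def skew(t):
    """3x3 skew-symmetric matrix so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_residuals(x1, x2, R, t):
    """Algebraic epipolar residuals x2^T E x1 with E = [t]_x R.

    x1, x2: (N, 3) homogeneous normalized points in the first and
    second camera, related by X2 = R @ X1 + t for the underlying 3D
    points. A zero residual means the correspondence satisfies the
    epipolar constraint exactly; noise and quantization make it
    nonzero, which is what the linear cost function penalizes.
    """
    E = skew(t) @ R
    return np.einsum('ni,ij,nj->n', x2, E, x1)
```

A noiseless correspondence yields a residual of zero regardless of depth, which is exactly why this formulation decouples motion from structure.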

In general, structure and motion from monocular image sequences is an inherently difficult problem with its own restrictions, as the computations are very sensitive to noise and to quantization of image points. In fact, the motion and structure computations are highly dependent on each other, and any ambiguity in the structure computation propagates to the motion computation and vice versa. On the other hand, calibrated stereo vision directly computes the structure of feature points. Therefore, integrating stereo and motion can considerably improve the structure and motion procedure. Several works in the literature fuse stereo and motion for a rigid scene to get better results. Young et al. [19] computed the rigid motion parameters assuming that depth information had already been computed by stereo vision. Weng et al. [20] derived a closed-form approximate matrix-weighted least-squares solution for motion parameters from three-dimensional point correspondences in two stereo image pairs. Li et al. [21] proposed a two-step fusion procedure: first, translational motion parameters were found from optical flows in binocular images; then, the stereo correspondences were estimated with knowledge of the translational motion parameters. Dornaika et al. [22] recovered the stereo correspondence using motion of a stereo rig in two consecutive steps: the first step uses metric data associated with the stereo rig, while the second step employs feature correspondences only. Ho et al. [23] combined stereo and motion analyses for three-dimensional reconstruction when a mobile platform was captured with two fixed cameras. Park et al. [24] estimated the object motion directly from the calibrated stereo image sequences. Although the combination of motion and stereo enhances the computations [25], the presence of statistical bias in estimating the motion parameters still has a destructive effect on the structure and motion estimation procedure.
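The closed-form motion-from-3D-correspondences idea attributed to Weng et al. [20] can be sketched in its simplest, unweighted form: the SVD-based (Kabsch) solution for (R, t) given triangulated points in two consecutive frames. This is a plain least-squares cousin of the matrix-weighted estimator in [20], not that estimator itself, and the function name is ours.

```python
import numpy as np

def rigid_motion_3d(P, Q):
    """Closed-form least-squares rigid motion (R, t) with Q ≈ R @ P + t.

    P, Q: (N, 3) corresponding 3D points, e.g. triangulated by a
    calibrated stereo rig in two consecutive frames. Centroids remove
    the translation; the SVD of the cross-covariance gives the
    rotation; the sign correction guarantees a proper rotation
    (det R = +1) rather than a reflection.
    """
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t
```

With noiseless correspondences the recovery is exact; with quantized stereo depths the residual errors are exactly the bias source the chapter sets out to reduce.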

In this chapter, a constructive method is presented to mitigate the bias problem using curve-based stereo matching and robust motion estimation by tracking the projection of space curves in perpendicular double stereo images. We prove mathematically and demonstrate experimentally that the presented method can increase motion estimation accuracy and reduce the problem of statistical bias. Moreover, the perpendicular double stereo setup is more robust against perturbation of edge points: any large error in the depth direction of stereo rig 1 is restricted by minimizing the error in the parallel direction of stereo rig 2, and vice versa. In addition, the curve-matching scheme is very robust against color maladjustment of the cameras and the shading problem during object motion.
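The intuition that one rig's depth error is restricted by the perpendicular rig rests on the standard first-order stereo error model: differentiating Z = fB/d shows depth error grows quadratically with distance, while lateral error (the direction the perpendicular rig sees it in) grows only linearly. A numeric sketch, with illustrative parameter names of our own:

```python
def depth_uncertainty(Z, focal_px, baseline_m, disparity_sigma_px):
    """First-order stereo depth uncertainty for a rectified rig.

    Differentiating Z = f*B/d gives |dZ/dd| = Z**2 / (f*B), so a
    disparity error of sigma_d pixels maps to a depth error of about
    Z**2 / (f*B) * sigma_d metres. Depth noise therefore grows
    quadratically with distance, whereas in-plane (lateral) noise
    grows only linearly -- which is why a second, perpendicular rig
    that observes the same error laterally can constrain it.
    """
    return Z ** 2 / (focal_px * baseline_m) * disparity_sigma_px
```

For example, at Z = 5 m with f = 500 px and B = 0.1 m, a half-pixel disparity error already maps to a depth error of 0.25 m.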

In section 2, a robust edge point correspondence with match propagation along the curves is presented to extract unique space curves with a greatly reduced number of outliers. In section 3, two sets of space curves, extracted from two distinct stereo rigs, are used to estimate object motion in sequential frames. A space curve-tracking algorithm is presented that minimizes the geometric distance of moving curves in the camera planes. The proposed curve-tracking method works well with pixel-accuracy information and does not require the complicated process of position computation at sub-pixel accuracy. An efficient stereo setup, the perpendicular double stereo, is presented to obtain as much accuracy as possible in the motion estimation process. Its properties are discussed and proven mathematically. In section 4, a set of calibrated virtual cameras is constructed from the motion information. This is achieved by treating the moving object as fixed and the fixed camera as moving in the opposite direction, so that the new virtual cameras are constructed as real calibrated cameras around the object. In section 5, the object's visual hull is recovered as finely as possible by intersecting the large number of cones established by the silhouettes of multiple views. A hierarchical method is presented to extract the visual hull of the object as bounding edges. In section 6, experimental results with both synthetic and real objects are presented. We conclude in section 7 with a brief discussion. The total procedure of three-dimensional model reconstruction, from raw images to a fine 3D model, consists of the following steps:

Step 1. Extraction of unique space curves on the surface of the rigid object from calibrated stereo image information, based on curvature and torsion consistency of the established space curves during rigid motion.

Step 2. Object tracking and rigid motion estimation by curve distance minimization in the projection of space curves onto the image planes of the perpendicular double stereo rigs.

Step 3. Construction of as many virtual calibrated cameras as required from the fixed real camera information and the rigid motion information.

Step 4. Reconstruction of the object's visual hull by intersecting the cones originating from the silhouettes of the object in the virtual cameras across time (space-time cameras).

Step 5. Texture mapping to every point of the visual hull through the visible virtual cameras.
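Step 1 relies on the differential invariants of a space curve: curvature and torsion, which are preserved under rigid motion and so can validate matches across frames. A minimal finite-difference estimate of both invariants for a uniformly sampled curve might look like the following sketch (function and parameter names are ours, not the chapter's):

```python
import numpy as np

def curvature_torsion(r, dt):
    """Discrete curvature and torsion of a sampled space curve.

    r: (N, 3) samples of a curve r(t) at uniform parameter step dt.
    Returns per-sample curvature kappa = |r' x r''| / |r'|**3 and
    torsion tau = (r' x r'') . r''' / |r' x r''|**2, estimated with
    central finite differences (accuracy degrades near the endpoints).
    Both quantities are invariant under rigid motion, which is what
    makes them usable as a matching consistency check.
    """
    d1 = np.gradient(r, dt, axis=0)
    d2 = np.gradient(d1, dt, axis=0)
    d3 = np.gradient(d2, dt, axis=0)
    cr = np.cross(d1, d2)
    cr_norm2 = np.einsum('ij,ij->i', cr, cr)
    kappa = np.sqrt(cr_norm2) / np.linalg.norm(d1, axis=1) ** 3
    tau = np.einsum('ij,ij->i', cr, d3) / cr_norm2
    return kappa, tau
```

On a helix (a*cos t, a*sin t, b*t) the analytic values are kappa = a/(a^2 + b^2) and tau = b/(a^2 + b^2), which the discrete estimate reproduces away from the endpoints.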

2. Reconstruction of Space Curves on the Surface of Object

The correspondence decision problem is the main difficulty of stereo vision. Components in the left image must be matched to those in the right image to compute disparity and thus depth. Several constraints, such as intensity correlation, epipolar geometry, ordering, depth continuity, and local orientation differences, have been proposed to improve the matching process, but their success has been limited. In many situations it is not possible to find point-like features such as corners or wedges. Then there is a need to deal, e.g.