
- •Preface
- •Biological Vision Systems
- •Visual Representations from Paintings to Photographs
- •Computer Vision
- •The Limitations of Standard 2D Images
- •3D Imaging, Analysis and Applications
- •Book Objective and Content
- •Acknowledgements
- •Contents
- •Contributors
- •2.1 Introduction
- •Chapter Outline
- •2.2 An Overview of Passive 3D Imaging Systems
- •2.2.1 Multiple View Approaches
- •2.2.2 Single View Approaches
- •2.3 Camera Modeling
- •2.3.1 Homogeneous Coordinates
- •2.3.2 Perspective Projection Camera Model
- •2.3.2.1 Camera Modeling: The Coordinate Transformation
- •2.3.2.2 Camera Modeling: Perspective Projection
- •2.3.2.3 Camera Modeling: Image Sampling
- •2.3.2.4 Camera Modeling: Concatenating the Projective Mappings
- •2.3.3 Radial Distortion
- •2.4 Camera Calibration
- •2.4.1 Estimation of a Scene-to-Image Planar Homography
- •2.4.2 Basic Calibration
- •2.4.3 Refined Calibration
- •2.4.4 Calibration of a Stereo Rig
- •2.5 Two-View Geometry
- •2.5.1 Epipolar Geometry
- •2.5.2 Essential and Fundamental Matrices
- •2.5.3 The Fundamental Matrix for Pure Translation
- •2.5.4 Computation of the Fundamental Matrix
- •2.5.5 Two Views Separated by a Pure Rotation
- •2.5.6 Two Views of a Planar Scene
- •2.6 Rectification
- •2.6.1 Rectification with Calibration Information
- •2.6.2 Rectification Without Calibration Information
- •2.7 Finding Correspondences
- •2.7.1 Correlation-Based Methods
- •2.7.2 Feature-Based Methods
- •2.8 3D Reconstruction
- •2.8.1 Stereo
- •2.8.1.1 Dense Stereo Matching
- •2.8.1.2 Triangulation
- •2.8.2 Structure from Motion
- •2.9 Passive Multiple-View 3D Imaging Systems
- •2.9.1 Stereo Cameras
- •2.9.2 3D Modeling
- •2.9.3 Mobile Robot Localization and Mapping
- •2.10 Passive Versus Active 3D Imaging Systems
- •2.11 Concluding Remarks
- •2.12 Further Reading
- •2.13 Questions
- •2.14 Exercises
- •References
- •3.1 Introduction
- •3.1.1 Historical Context
- •3.1.2 Basic Measurement Principles
- •3.1.3 Active Triangulation-Based Methods
- •3.1.4 Chapter Outline
- •3.2 Spot Scanners
- •3.2.1 Spot Position Detection
- •3.3 Stripe Scanners
- •3.3.1 Camera Model
- •3.3.2 Sheet-of-Light Projector Model
- •3.3.3 Triangulation for Stripe Scanners
- •3.4 Area-Based Structured Light Systems
- •3.4.1 Gray Code Methods
- •3.4.1.1 Decoding of Binary Fringe-Based Codes
- •3.4.1.2 Advantage of the Gray Code
- •3.4.2 Phase Shift Methods
- •3.4.2.1 Removing the Phase Ambiguity
- •3.4.3 Triangulation for a Structured Light System
- •3.5 System Calibration
- •3.6 Measurement Uncertainty
- •3.6.1 Uncertainty Related to the Phase Shift Algorithm
- •3.6.2 Uncertainty Related to Intrinsic Parameters
- •3.6.3 Uncertainty Related to Extrinsic Parameters
- •3.6.4 Uncertainty as a Design Tool
- •3.7 Experimental Characterization of 3D Imaging Systems
- •3.7.1 Low-Level Characterization
- •3.7.2 System-Level Characterization
- •3.7.3 Characterization of Errors Caused by Surface Properties
- •3.7.4 Application-Based Characterization
- •3.8 Selected Advanced Topics
- •3.8.1 Thin Lens Equation
- •3.8.2 Depth of Field
- •3.8.3 Scheimpflug Condition
- •3.8.4 Speckle and Uncertainty
- •3.8.5 Laser Depth of Field
- •3.8.6 Lateral Resolution
- •3.9 Research Challenges
- •3.10 Concluding Remarks
- •3.11 Further Reading
- •3.12 Questions
- •3.13 Exercises
- •References
- •4.1 Introduction
- •Chapter Outline
- •4.2 Representation of 3D Data
- •4.2.1 Raw Data
- •4.2.1.1 Point Cloud
- •4.2.1.2 Structured Point Cloud
- •4.2.1.3 Depth Maps and Range Images
- •4.2.1.4 Needle Map
- •4.2.1.5 Polygon Soup
- •4.2.2 Surface Representations
- •4.2.2.1 Triangular Mesh
- •4.2.2.2 Quadrilateral Mesh
- •4.2.2.3 Subdivision Surfaces
- •4.2.2.4 Morphable Model
- •4.2.2.5 Implicit Surface
- •4.2.2.6 Parametric Surface
- •4.2.2.7 Comparison of Surface Representations
- •4.2.3 Solid-Based Representations
- •4.2.3.1 Voxels
- •4.2.3.3 Binary Space Partitioning
- •4.2.3.4 Constructive Solid Geometry
- •4.2.3.5 Boundary Representations
- •4.2.4 Summary of Solid-Based Representations
- •4.3 Polygon Meshes
- •4.3.1 Mesh Storage
- •4.3.2 Mesh Data Structures
- •4.3.2.1 Halfedge Structure
- •4.4 Subdivision Surfaces
- •4.4.1 Doo-Sabin Scheme
- •4.4.2 Catmull-Clark Scheme
- •4.4.3 Loop Scheme
- •4.5 Local Differential Properties
- •4.5.1 Surface Normals
- •4.5.2 Differential Coordinates and the Mesh Laplacian
- •4.6 Compression and Levels of Detail
- •4.6.1 Mesh Simplification
- •4.6.1.1 Edge Collapse
- •4.6.1.2 Quadric Error Metric
- •4.6.2 QEM Simplification Summary
- •4.6.3 Surface Simplification Results
- •4.7 Visualization
- •4.8 Research Challenges
- •4.9 Concluding Remarks
- •4.10 Further Reading
- •4.11 Questions
- •4.12 Exercises
- •References
- •1.1 Introduction
- •Chapter Outline
- •1.2 A Historical Perspective on 3D Imaging
- •1.2.1 Image Formation and Image Capture
- •1.2.2 Binocular Perception of Depth
- •1.2.3 Stereoscopic Displays
- •1.3 The Development of Computer Vision
- •1.3.1 Further Reading in Computer Vision
- •1.4 Acquisition Techniques for 3D Imaging
- •1.4.1 Passive 3D Imaging
- •1.4.2 Active 3D Imaging
- •1.4.3 Passive Stereo Versus Active Stereo Imaging
- •1.5 Twelve Milestones in 3D Imaging and Shape Analysis
- •1.5.1 Active 3D Imaging: An Early Optical Triangulation System
- •1.5.2 Passive 3D Imaging: An Early Stereo System
- •1.5.3 Passive 3D Imaging: The Essential Matrix
- •1.5.4 Model Fitting: The RANSAC Approach to Feature Correspondence Analysis
- •1.5.5 Active 3D Imaging: Advances in Scanning Geometries
- •1.5.6 3D Registration: Rigid Transformation Estimation from 3D Correspondences
- •1.5.7 3D Registration: Iterative Closest Points
- •1.5.9 3D Local Shape Descriptors: Spin Images
- •1.5.10 Passive 3D Imaging: Flexible Camera Calibration
- •1.5.11 3D Shape Matching: Heat Kernel Signatures
- •1.6 Applications of 3D Imaging
- •1.7 Book Outline
- •1.7.1 Part I: 3D Imaging and Shape Representation
- •1.7.2 Part II: 3D Shape Analysis and Processing
- •1.7.3 Part III: 3D Imaging Applications
- •References
- •5.1 Introduction
- •5.1.1 Applications
- •5.1.2 Chapter Outline
- •5.2 Mathematical Background
- •5.2.1 Differential Geometry
- •5.2.2 Curvature of Two-Dimensional Surfaces
- •5.2.3 Discrete Differential Geometry
- •5.2.4 Diffusion Geometry
- •5.2.5 Discrete Diffusion Geometry
- •5.3 Feature Detectors
- •5.3.1 A Taxonomy
- •5.3.2 Harris 3D
- •5.3.3 Mesh DOG
- •5.3.4 Salient Features
- •5.3.5 Heat Kernel Features
- •5.3.6 Topological Features
- •5.3.7 Maximally Stable Components
- •5.3.8 Benchmarks
- •5.4 Feature Descriptors
- •5.4.1 A Taxonomy
- •5.4.2 Curvature-Based Descriptors (HK and SC)
- •5.4.3 Spin Images
- •5.4.4 Shape Context
- •5.4.5 Integral Volume Descriptor
- •5.4.6 Mesh Histogram of Gradients (HOG)
- •5.4.7 Heat Kernel Signature (HKS)
- •5.4.8 Scale-Invariant Heat Kernel Signature (SI-HKS)
- •5.4.9 Color Heat Kernel Signature (CHKS)
- •5.4.10 Volumetric Heat Kernel Signature (VHKS)
- •5.5 Research Challenges
- •5.6 Conclusions
- •5.7 Further Reading
- •5.8 Questions
- •5.9 Exercises
- •References
- •6.1 Introduction
- •Chapter Outline
- •6.2 Registration of Two Views
- •6.2.1 Problem Statement
- •6.2.2 The Iterative Closest Points (ICP) Algorithm
- •6.2.3 ICP Extensions
- •6.2.3.1 Techniques for Pre-alignment
- •Global Approaches
- •Local Approaches
- •6.2.3.2 Techniques for Improving Speed
- •Subsampling
- •Closest Point Computation
- •Distance Formulation
- •6.2.3.3 Techniques for Improving Accuracy
- •Outlier Rejection
- •Additional Information
- •Probabilistic Methods
- •6.3 Advanced Techniques
- •6.3.1 Registration of More than Two Views
- •Reducing Error Accumulation
- •Automating Registration
- •6.3.2 Registration in Cluttered Scenes
- •Point Signatures
- •Matching Methods
- •6.3.3 Deformable Registration
- •Methods Based on General Optimization Techniques
- •Probabilistic Methods
- •6.3.4 Machine Learning Techniques
- •Improving the Matching
- •Object Detection
- •6.4 Quantitative Performance Evaluation
- •6.5 Case Study 1: Pairwise Alignment with Outlier Rejection
- •6.6 Case Study 2: ICP with Levenberg-Marquardt
- •6.6.1 The LM-ICP Method
- •6.6.2 Computing the Derivatives
- •6.6.3 The Case of Quaternions
- •6.6.4 Summary of the LM-ICP Algorithm
- •6.6.5 Results and Discussion
- •6.7 Case Study 3: Deformable ICP with Levenberg-Marquardt
- •6.7.1 Surface Representation
- •6.7.2 Cost Function
- •Data Term: Global Surface Attraction
- •Data Term: Boundary Attraction
- •Penalty Term: Spatial Smoothness
- •Penalty Term: Temporal Smoothness
- •6.7.3 Minimization Procedure
- •6.7.4 Summary of the Algorithm
- •6.7.5 Experiments
- •6.8 Research Challenges
- •6.9 Concluding Remarks
- •6.10 Further Reading
- •6.11 Questions
- •6.12 Exercises
- •References
- •7.1 Introduction
- •7.1.1 Retrieval and Recognition Evaluation
- •7.1.2 Chapter Outline
- •7.2 Literature Review
- •7.3 3D Shape Retrieval Techniques
- •7.3.1 Depth-Buffer Descriptor
- •7.3.1.1 Computing the 2D Projections
- •7.3.1.2 Obtaining the Feature Vector
- •7.3.1.3 Evaluation
- •7.3.1.4 Complexity Analysis
- •7.3.2 Spin Images for Object Recognition
- •7.3.2.1 Matching
- •7.3.2.2 Evaluation
- •7.3.2.3 Complexity Analysis
- •7.3.3 Salient Spectral Geometric Features
- •7.3.3.1 Feature Points Detection
- •7.3.3.2 Local Descriptors
- •7.3.3.3 Shape Matching
- •7.3.3.4 Evaluation
- •7.3.3.5 Complexity Analysis
- •7.3.4 Heat Kernel Signatures
- •7.3.4.1 Evaluation
- •7.3.4.2 Complexity Analysis
- •7.4 Research Challenges
- •7.5 Concluding Remarks
- •7.6 Further Reading
- •7.7 Questions
- •7.8 Exercises
- •References
- •8.1 Introduction
- •Chapter Outline
- •8.2 3D Face Scan Representation and Visualization
- •8.3 3D Face Datasets
- •8.3.1 FRGC v2 3D Face Dataset
- •8.3.2 The Bosphorus Dataset
- •8.4 3D Face Recognition Evaluation
- •8.4.1 Face Verification
- •8.4.2 Face Identification
- •8.5 Processing Stages in 3D Face Recognition
- •8.5.1 Face Detection and Segmentation
- •8.5.2 Removal of Spikes
- •8.5.3 Filling of Holes and Missing Data
- •8.5.4 Removal of Noise
- •8.5.5 Fiducial Point Localization and Pose Correction
- •8.5.6 Spatial Resampling
- •8.5.7 Feature Extraction on Facial Surfaces
- •8.5.8 Classifiers for 3D Face Matching
- •8.6 ICP-Based 3D Face Recognition
- •8.6.1 ICP Outline
- •8.6.2 A Critical Discussion of ICP
- •8.6.3 A Typical ICP-Based 3D Face Recognition Implementation
- •8.6.4 ICP Variants and Other Surface Registration Approaches
- •8.7 PCA-Based 3D Face Recognition
- •8.7.1 PCA System Training
- •8.7.2 PCA Training Using Singular Value Decomposition
- •8.7.3 PCA Testing
- •8.7.4 PCA Performance
- •8.8 LDA-Based 3D Face Recognition
- •8.8.1 Two-Class LDA
- •8.8.2 LDA with More than Two Classes
- •8.8.3 LDA in High Dimensional 3D Face Spaces
- •8.8.4 LDA Performance
- •8.9 Normals and Curvature in 3D Face Recognition
- •8.9.1 Computing Curvature on a 3D Face Scan
- •8.10 Recent Techniques in 3D Face Recognition
- •8.10.1 3D Face Recognition Using Annotated Face Models (AFM)
- •8.10.2 Local Feature-Based 3D Face Recognition
- •8.10.2.1 Keypoint Detection and Local Feature Matching
- •8.10.2.2 Other Local Feature-Based Methods
- •8.10.3 Expression Modeling for Invariant 3D Face Recognition
- •8.10.3.1 Other Expression Modeling Approaches
- •8.11 Research Challenges
- •8.12 Concluding Remarks
- •8.13 Further Reading
- •8.14 Questions
- •8.15 Exercises
- •References
- •9.1 Introduction
- •Chapter Outline
- •9.2 DEM Generation from Stereoscopic Imagery
- •9.2.1 Stereoscopic DEM Generation: Literature Review
- •9.2.2 Accuracy Evaluation of DEMs
- •9.2.3 An Example of DEM Generation from SPOT-5 Imagery
- •9.3 DEM Generation from InSAR
- •9.3.1 Techniques for DEM Generation from InSAR
- •9.3.1.1 Basic Principle of InSAR in Elevation Measurement
- •9.3.1.2 Processing Stages of DEM Generation from InSAR
- •The Branch-Cut Method of Phase Unwrapping
- •The Least Squares (LS) Method of Phase Unwrapping
- •9.3.2 Accuracy Analysis of DEMs Generated from InSAR
- •9.3.3 Examples of DEM Generation from InSAR
- •9.4 DEM Generation from LIDAR
- •9.4.1 LIDAR Data Acquisition
- •9.4.2 Accuracy, Error Types and Countermeasures
- •9.4.3 LIDAR Interpolation
- •9.4.4 LIDAR Filtering
- •9.4.5 DTM from Statistical Properties of the Point Cloud
- •9.5 Research Challenges
- •9.6 Concluding Remarks
- •9.7 Further Reading
- •9.8 Questions
- •9.9 Exercises
- •References
- •10.1 Introduction
- •10.1.1 Allometric Modeling of Biomass
- •10.1.2 Chapter Outline
- •10.2 Aerial Photo Mensuration
- •10.2.1 Principles of Aerial Photogrammetry
- •10.2.1.1 Geometric Basis of Photogrammetric Measurement
- •10.2.1.2 Ground Control and Direct Georeferencing
- •10.2.2 Tree Height Measurement Using Forest Photogrammetry
- •10.2.2.2 Automated Methods in Forest Photogrammetry
- •10.3 Airborne Laser Scanning
- •10.3.1 Principles of Airborne Laser Scanning
- •10.3.1.1 Lidar-Based Measurement of Terrain and Canopy Surfaces
- •10.3.2 Individual Tree-Level Measurement Using Lidar
- •10.3.2.1 Automated Individual Tree Measurement Using Lidar
- •10.3.3 Area-Based Approach to Estimating Biomass with Lidar
- •10.4 Future Developments
- •10.5 Concluding Remarks
- •10.6 Further Reading
- •10.7 Questions
- •References
- •11.1 Introduction
- •Chapter Outline
- •11.2 Volumetric Data Acquisition
- •11.2.1 Computed Tomography
- •11.2.1.1 Characteristics of 3D CT Data
- •11.2.2 Positron Emission Tomography (PET)
- •11.2.2.1 Characteristics of 3D PET Data
- •Relaxation
- •11.2.3.1 Characteristics of the 3D MRI Data
- •Image Quality and Artifacts
- •11.2.4 Summary
- •11.3 Surface Extraction and Volumetric Visualization
- •11.3.1 Surface Extraction
- •Example: Curvatures and Geometric Tools
- •11.3.2 Volume Rendering
- •11.3.3 Summary
- •11.4 Volumetric Image Registration
- •11.4.1 A Hierarchy of Transformations
- •11.4.1.1 Rigid Body Transformation
- •11.4.1.2 Similarity Transformations and Anisotropic Scaling
- •11.4.1.3 Affine Transformations
- •11.4.1.4 Perspective Transformations
- •11.4.1.5 Non-rigid Transformations
- •11.4.2 Points and Features Used for the Registration
- •11.4.2.1 Landmark Features
- •11.4.2.2 Surface-Based Registration
- •11.4.2.3 Intensity-Based Registration
- •11.4.3 Registration Optimization
- •11.4.3.1 Estimation of Registration Errors
- •11.4.4 Summary
- •11.5 Segmentation
- •11.5.1 Semi-automatic Methods
- •11.5.1.1 Thresholding
- •11.5.1.2 Region Growing
- •11.5.1.3 Deformable Models
- •Snakes
- •Balloons
- •11.5.2 Fully Automatic Methods
- •11.5.2.1 Atlas-Based Segmentation
- •11.5.2.2 Statistical Shape Modeling and Analysis
- •11.5.3 Summary
- •11.6 Diffusion Imaging: An Illustration of a Full Pipeline
- •11.6.1 From Scalar Images to Tensors
- •11.6.2 From Tensor Image to Information
- •11.6.3 Summary
- •11.7 Applications
- •11.7.1 Diagnosis and Morphometry
- •11.7.2 Simulation and Training
- •11.7.3 Surgical Planning and Guidance
- •11.7.4 Summary
- •11.8 Concluding Remarks
- •11.9 Research Challenges
- •11.10 Further Reading
- •Data Acquisition
- •Surface Extraction
- •Volume Registration
- •Segmentation
- •Diffusion Imaging
- •Software
- •11.11 Questions
- •11.12 Exercises
- •References
- •Index

3D Imaging, Analysis and Applications

Nick Pears · Yonghuai Liu · Peter Bunting
Editors
Nick Pears
Department of Computer Science
University of York
York, UK

Yonghuai Liu
Institute of Geography and Earth Sciences
Aberystwyth University
Aberystwyth, UK

Peter Bunting
Institute of Geography and Earth Sciences
Aberystwyth University
Aberystwyth, UK
ISBN 978-1-4471-4062-7 ISBN 978-1-4471-4063-4 (eBook)
DOI 10.1007/978-1-4471-4063-4
Springer London Heidelberg New York Dordrecht
Library of Congress Control Number: 2012939510
© Springer-Verlag London 2012
Chapter 3—Active 3D Imaging Systems, pp. 95–138 © Her Majesty the Queen in Right of Canada 2012
Chapter 10—High-Resolution Three-Dimensional Remote Sensing for Forest Measurement, pp. 417–443 © Springer-Verlag London (outside the USA) 2012
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Dr. Philip Batchelor (1967–2011)
Mathematicians are born, not made
Jules Henri Poincaré
This book is dedicated to the memory of Dr. Philip Batchelor, who tragically died in a climbing accident while close to completing work on Chapter 11. Born in Cornwall in England but brought up in Vouvry in Switzerland, he was always proud of both sides of his heritage. A truly gifted mathematician, he devoted his talents to medical imaging, where he made many valuable contributions. He discovered a matrix formalization of motion in MRI, which inspired the field of non-rigid MRI motion correction. In diffusion imaging, he provided a mathematical framework for the treatment of tensors, leading to the geodesic anisotropy measure. More recently, he and co-workers made significant progress in tackling the hugely difficult task of diffusion tensor imaging of the beating heart. A popular senior lecturer in Imaging Sciences at King’s College London, he was highly valued for his teaching as well as his research. He always had time for students, colleagues and anyone who sought his advice. His insight into problems in three dimensions and higher benefited many who were lucky enough to work with him. He would have been delighted to pass on some of his knowledge in this book. Beyond his scientific abilities, it is his warmth, generosity of spirit and consistent smile that will be remembered with affection by colleagues, friends and family alike.
Preface
This book is primarily a graduate text on 3D imaging, shape analysis and associated applications. In addition to serving masters-level and doctoral-level research students, much of the text is accessible enough to be useful to final-year undergraduate project students. Also, we hope that it will serve wider audiences; for example, as a reference text for professional academics, people working in commercial research and development labs and industrial practitioners.
We believe that this text is unique in the literature on 3D imaging in several respects: (1) it provides a wide coverage of topics in 3D imaging; (2) it pays special attention to the clear presentation of well-established core techniques; (3) it covers a wide range of the most promising recent techniques that are considered to be state-of-the-art; (4) it is the first time that so many world-leading academics have come together on this range of 3D imaging topics.
Firstly, we ask: why is 3D imaging so interesting to study, research and develop? Why is it so important to our society, economy and culture? In due course, we will answer these questions but, to provide a wide context, we need to consider the development of biological vision systems, the development of visual representations and the role of digital images in today’s information society. This leads us to the field of computer vision and ultimately to the subject of this book: 3D imaging.
Biological Vision Systems
It is worth reflecting on the obvious fact that the importance of images, in their very general sense (paintings, digital photographs, etc.), is rooted in our ability to see. The ability to sense light dates back to at least the Cambrian period, over half a billion years ago, as evidenced by fossils of trilobites’ compound eyes in the Burgess Shale. For many animals, including humans, the eye-brain vision system constitutes an indispensable, highly information-rich mode of sensory perception. The evolution of this system has been driven by aspects of our particular optical world: with the Sun we have a natural source of light; air is transparent, which allows our environment to be illuminated; the wavelength of light is small enough to be scattered by
most surfaces, which means that we can gather light reflected off them from many different viewpoints, and it is possible to form an image with a relatively simple structure such as a convex lens. It is easy to appreciate the utility and advantages of visual perception, which provides a means to sense the environment in a comprehensive, non-contact way. Indeed, evolutionary biologists have proposed that the development of vision intensified predation and sparked an evolutionary arms race between predators and prey. Thus biological vision systems exist as a result of evolutionary survival and have been present on Earth for a very long time.
Today, there is a comprehensive research literature in visual psychophysics, the study of eyesight. Visual illusions suggest that, for some 3D cues, humans make many assumptions about their environment when inferring 3D shape. The study of how depth perception depends on several visual and oculomotor cues has influenced the development of modern techniques in 3D imaging. Ideas have also flowed in the opposite direction: results from computer vision indicate what information can be extracted from raw visual data, thereby inspiring theories of human visual perception.
Visual Representations from Paintings to Photographs
A drawing, painting or photograph can be viewed as a form of visual expression and communication. Drawing and painting have a very long history, and several discoveries of very old cave paintings have been made; for example, those in the Chauvet cave (France) are more than 32,000 years old. Throughout the history of mankind, the use of paintings, drawings and automatically captured images has been culturally important in many different ways. In terms of the subject matter of this book, the advent of photography in the early 19th century was an important milestone, enabling light reflected from an object or scene to be recorded and retained in a durable way. Photographs have many advantages, such as an accurate, objective visual representation of the scene and a high level of autonomy in image capture. Once photography was born, it was soon realized that measurements of the imaged scene could be made; for example, a map could be built from photographs taken from a balloon. Thus the field of photogrammetry was born, which focuses on extracting accurate measurements from images (this is now often included under the banner of remote sensing). The work done in this field, from the mid-19th to the mid-20th century, is an essential historical precursor to the material presented in this book.
The Role of Digital Images in Today’s Information Society
Today we live in an information society, which is characterized by the economic and cultural importance of information in its many forms. The creation, distribution, analysis and intelligent use of digital information now has a wide range of impacts on everyone in the developed world, throughout all stages of their lives. This can be
anything from the way in which we do our jobs, to the way our health is managed by our doctors; to how our safety is secured on public transport, during air travel and in public spaces; to how our financial systems work; to how we manage our shopping and leisure time. Of course, many of these impacts have been brought about by the advances in computing and communications technology over the past few decades, with the most obvious examples being the Internet, the world wide web and, more recently, mobile access to these resources over wide-bandwidth, wireless technologies.
The information that we use is often described as being multi-media, in the sense that it includes text, audio, graphical figures, images, video and interactive content. Of these different forms, images have always played a hugely important role for obvious reasons: they can convey a huge amount of information in a compact form, they clarify concepts and ideas that are difficult to describe in words, they draw in their audience by making documents more visually appealing and they relate directly to our primary non-contact mechanism for sensing our environment (we humans live and interact in a visual world). For these reasons, images will always be important, but recent advances in both hardware and software technologies are likely to amplify this. For example, digital cameras are now embedded in many devices such as smartphones and the relatively recent explosion in social networking allows groups of people to share images.
Computer Vision
The information society brings with it such a large number of images that we cannot hope to analyze them all manually; consider, for example, the number of images from the security surveillance of a large city over a 24-hour period. The goal of computer vision is to automate the analysis of images through the use of computers, and the material presented in this book fits that high-level aim. Since there are a large number of ways in which we use images, there is a correspondingly large number of applications of computer vision. In general, automation can improve image analysis performance (people get bored), it increases coverage, it reduces operating costs and, in some applications, it leads to improved safety (e.g. robots working in hazardous environments). The last four decades have seen the rapid evolution of imaging technology and computing power, which has fed the growth of the field of computer vision and the related fields of image analysis and pattern recognition.
The Limitations of Standard 2D Images
Most of the images in our information society are standard 2D color-texture images (i.e. the kind of images that we capture from our mobile phones and digital cameras), or they are effectively sequences of such images that constitute videos. Although this is sufficient for many of the uses that we have described,
particularly if the images/video are for direct viewing by a human (e.g. for entertainment or social network use), single, standard 2D images present a number of difficulties when analyzed by computers. Many of these difficulties stem from the fact that the 3D world is projected down onto a 2D image, thus losing depth information and creating ambiguity. For example, in a 2D image: how do we segment foreground objects from the background? How can we recognize the same object from different viewpoints? How do we deal with the ambiguity between object size and distance from the camera? To compound these problems, there is also the issue of how to deal with varying illumination, which can make the same object appear quite different when imaged.
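The size-distance ambiguity follows directly from the perspective projection model treated in detail in Chapter 2: a scene point (X, Y, Z) maps to image coordinates (fX/Z, fY/Z), so scaling an object and its distance from the camera by the same factor leaves its image unchanged. A minimal sketch in plain Python (the point coordinates and focal length are illustrative values, not taken from the book):

```python
def project(point, f=1.0):
    """Pinhole projection: a 3D point (X, Y, Z) maps to (f*X/Z, f*Y/Z)."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# A small nearby object and one twice as large at twice the distance
# produce identical image coordinates: depth is lost in the projection.
near = (0.5, 0.2, 2.0)
far = (1.0, 0.4, 4.0)
assert project(near) == project(far)
```

A single 2D measurement therefore cannot distinguish these two scenes; recovering Z requires extra information, such as a second view.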
3D Imaging, Analysis and Applications
To address the problems of standard 2D images, described above, 3D imaging techniques have been developed within the field of computer vision that automatically reconstruct the 3D shape of the imaged objects and scene. This is referred to as a 3D scan or 3D image and it often comes with a registered color-texture image that can be pasted over the captured shape and rendered from many viewpoints (if desired) on a computer display.
The techniques developed include both active systems, where some form of illumination is projected onto the scene, and passive systems, where the natural illumination of the scene is used. Perhaps the most intensively researched area of 3D shape acquisition is stereo vision, in which a system, like the human visual system, uses a pair of views (images) in order to compute 3D structure. Here, researchers have met challenging problems, such as the establishment of correspondences between overlapping images for the dense reconstruction of the imaged scene. Many applications require further processing and data analysis once 3D shape data has been acquired: for example, the identification of salient points within the 3D data, the registration of multiple partial 3D scans, the computation of 3D symmetry planes and the matching of whole 3D objects.
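Once a correspondence has been established between a rectified pair of views, depth follows from the standard triangulation relation Z = fb/d, where f is the focal length, b the baseline between the cameras and d the disparity (the horizontal offset of the matched point between the two images); this is why correspondence finding is so central. A minimal sketch in plain Python, with hypothetical rig parameters:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Depth of a matched point in a rectified stereo pair: Z = f * b / d.

    Larger disparity means a closer point; zero disparity lies at infinity.
    """
    if disparity_px <= 0:
        raise ValueError("non-positive disparity: point at infinity or bad match")
    return f_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 0.12 m baseline.
# A correspondence with 21 px disparity lies about 4 m from the cameras.
z = depth_from_disparity(700.0, 0.12, 21.0)
```

The inverse relationship between disparity and depth also explains why stereo depth resolution degrades with distance: a one-pixel matching error displaces a far point much more than a near one.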
It is one of today’s challenges to design technology that can cover the whole pipeline of 3D shape capture, processing and visualization. The different steps of this pipeline have raised important topics in the research community for decades, owing to the numerous theoretical and technical problems that they induce. Capturing the 3D shape, instead of just a 2D projection as a standard camera does, makes an extremely wide array of new kinds of application possible: for instance, 3D and free-viewpoint TV, virtual and augmented reality, natural user interaction based on monitoring gestures, 3D object recognition, 3D recognition for biometrics, 3D medical imaging, 3D remote sensing, industrial inspection and robot navigation, to name just a few. These applications, of course, involve much more than just 3D shape capture: storage, analysis, transmission and visualization of the 3D shape are also part of the whole pipeline.