It is the resolution and compression of this data that determines the volume of data stored. In turn, the challenges of storing, manipulating and visualizing this data grow as the volume increases.

Chapter Outline In this chapter, we provide an overview of how 3D data can be stored, modeled and visualized. We begin by providing a taxonomy of 3D data representations. We then present, in more detail, a selection of the most important 3D data representations. Firstly, we focus on triangular meshes, describing the data structures available for efficiently processing such data. We take, as an example, the halfedge data structure and provide some implementation details. Secondly, we describe schemes for subdivision surfaces. Having considered methods for representing 3D data, we then discuss how local differential surface properties can be computed for the most common representation. Next, we describe how 3D data can be compressed and simplified before finally discussing visualization of 3D data, providing examples of the sorts of visualizations that are available using commonly used tools.

4.2 Representation of 3D Data

The representation of 3D data is the foundation of a number of important applications, such as computer-aided geometric design, visualization and graphics. In this section, we summarize various 3D representations which we classify as: raw data (i.e. delivered by a 3D sensing device), surfaces (i.e. 2D manifolds embedded in 3D space) and solids (i.e. 3D objects with volume).

4.2.1 Raw Data

The raw output of a 3D sensor can take a number of forms, such as a point cloud, a depth map or a set of polygons. Often, data represented in these raw forms requires further processing prior to analysis. Moreover, these representations may permit non-manifold or noisy surfaces to be represented, which may hinder subsequent analysis.

4.2.1.1 Point Cloud

In its simplest form, 3D data exists as a set of unstructured 3-dimensional coordinates called a point cloud, P, where P = {v1, . . . , vn} and vi ∈ R3. Typically, a point cloud of n points is stored as an n × 3 array of floating point numbers or a linked list of n vertex records. Point clouds arise most commonly in vision as the output of multiview stereo [22] or related techniques such as SLAM (simultaneous localization and mapping) [63]. They also arise from laser range scanning devices, where the 3D positions of vertices lying along the intersection between a laser stripe and the surface are computed. Vertices may be augmented by additional information such as texture or, in the case of oriented points, a surface normal [28]. A visualization of a point cloud is shown in Fig. 4.21(a). In order to further process point cloud data, it is often necessary to fit a smooth surface to the data in a manner which is robust to noise in the point positions. However, the direct rendering of vertex data (known as point-based rendering) has developed as a sub-field within graphics that offers certain advantages over traditional polygon-based rendering [57].
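
As an illustration of the simplest in-memory layout described above, the following Python sketch holds a point cloud as an n × 3 array of floating point numbers, optionally paired with a parallel array of per-point normals for oriented points. The file name and the whitespace-separated xyz format are assumptions made for the example, not a prescribed format.

```python
import numpy as np

def load_xyz(path):
    """Load a point cloud from a whitespace-separated 'x y z' text file
    (a hypothetical minimal format) into an n x 3 float array."""
    return np.loadtxt(path, dtype=np.float64).reshape(-1, 3)

points = load_xyz("scan.xyz")          # shape (n, 3)
normals = np.zeros_like(points)        # optional per-point normals (oriented points)

# Simple whole-cloud statistics: centroid and axis-aligned bounding box.
centroid = points.mean(axis=0)
bbox_min, bbox_max = points.min(axis=0), points.max(axis=0)
```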

4.2.1.2 Structured Point Cloud

A more constrained representation may be used when point cloud vertices adhere to an underlying structure, namely a grid with arbitrary sampling. In this case, vertices are stored in an ordered m × n × 3 array and, for each point i = 1..m, j = 1..n, there is a corresponding 3D vertex [x(i, j) y(i, j) z(i, j)]T ∈ R3. Moreover, the ordering of the points is such that adjacent vertices share adjacent indices. There is an implicit mesh connectivity between neighboring points and non-boundary vertices always have degree 6. Conversion to a triangular mesh is straightforward, by constructing an edge between all pairs of adjacent vertices. Often, there is an additional binary 2D array of size m × n which indicates the presence or absence of 3D data (for example, parts of the surface being imaged may have poor reflectance). Instead of a binary value, a scalar “confidence” value can be stored, providing an indication of measurement uncertainty at each point. Finally, a grayscale or color-texture image of the same dimensions may also be associated with the 3D data. In this case, the format provides an implicit correspondence between 2D pixels and 3D vertices, assuming that the 3D camera captures and stores such information. An example of a commonly used structured point cloud dataset is the 3D face data in the Face Recognition Grand Challenge version 2 data release [52].
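
The conversion to a triangular mesh mentioned above can be sketched as follows. This is an illustrative Python fragment, assuming the vertices are held in an m × n × 3 array and the binary validity array is an m × n boolean mask; each grid cell whose four corners are all valid is split into two triangles.

```python
import numpy as np

def grid_to_triangles(vertices, mask):
    """Triangulate a structured point cloud.

    vertices : (m, n, 3) array of 3D points on the sampling grid
    mask     : (m, n) boolean array, True where a valid measurement exists
    Returns (verts, faces): verts is (m*n, 3); faces indexes into verts,
    with two triangles per grid cell whose four corners are all valid."""
    m, n = mask.shape
    idx = np.arange(m * n).reshape(m, n)   # flat vertex index of each grid point
    faces = []
    for i in range(m - 1):
        for j in range(n - 1):
            if mask[i, j] and mask[i, j + 1] and mask[i + 1, j] and mask[i + 1, j + 1]:
                a, b = idx[i, j], idx[i, j + 1]
                c, d = idx[i + 1, j], idx[i + 1, j + 1]
                faces.append((a, b, c))    # split the quad along one diagonal
                faces.append((b, d, c))
    return vertices.reshape(-1, 3), np.array(faces, dtype=np.int64)
```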

4.2.1.3 Depth Maps and Range Images

A special case of structured point cloud arises when the sampling of points in the x–y plane is viewer-centered. Although the terms are often used interchangeably, we define a range image as a structured point cloud which arises from a perspective projection, and a depth map as one which arises from an orthogonal projection and regular sampling of 3D vertices over a 2D image plane. Both representations have the advantage that they can be represented by a 2D function z(x, y). Hence, these representations require less storage than those which allow variable spacing of points in the (x, y) plane and can effectively be stored (and compressed) as an image. In the case of a depth map, the only additional information required to reconstruct 3D vertex positions is the fixed spacings, Δx and Δy. In the case of a range image, parameters related to the camera projection (e.g. focal length and center of projection) must also be stored. Depth maps and range images can be visualized as grayscale images, whereby image intensity represents the distance to the surface (see Fig. 4.21(d)). Alternatively, they can be converted into a triangular mesh and rendered. Since the vertices are evenly distributed over the image plane, a regular triangulation can be used. Range images are the natural representation for binocular stereo [58] where, for each pixel, a disparity value is calculated that is related to depth. In addition, range images are often computed as an intermediate representation as part of the rendering pipeline. Here they are used for z-buffering and to efficiently simulate many visual effects such as depth of field and atmospheric attenuation.
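
To make the reconstruction of vertex positions concrete, the following Python sketch back-projects a depth map using the fixed spacings Δx and Δy, and a range image using a simple pinhole model with focal length f (in pixels) and center of projection (cx, cy). The pinhole model, and the assumption that the range image stores depth along the optical axis rather than Euclidean range to the sensor, are simplifications for illustration; real sensors may require a fuller calibration model.

```python
import numpy as np

def depth_map_to_points(z, dx, dy):
    """Orthogonal back-projection: recover [x, y, z] vertices from a
    depth map z(i, j) given the fixed sample spacings dx and dy."""
    m, n = z.shape
    j, i = np.meshgrid(np.arange(n), np.arange(m))
    return np.dstack((j * dx, i * dy, z))          # (m, n, 3) vertex array

def range_image_to_points(z, f, cx, cy):
    """Perspective back-projection (pinhole model): recover vertices from a
    range image storing depth along the optical axis, given focal length f
    (in pixels) and center of projection (cx, cy)."""
    m, n = z.shape
    u, v = np.meshgrid(np.arange(n), np.arange(m))
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.dstack((x, y, z))                    # (m, n, 3) vertex array
```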

4.2.1.4 Needle Map

Photometric shape reconstruction methods often recover an intermediate representation comprising per-pixel estimates of the orientation of the underlying surface z(x, y). In graphics this is known as a bump map. This is either in the form of surface gradients, i.e. p(x, y) = ∂z(x, y)/∂x and q(x, y) = ∂z(x, y)/∂y, or surface normals, i.e. n(x, y) = [−p(x, y) −q(x, y) 1]T. A needle map can be rendered by using a reflectance function to locally shade each pixel. Alternatively, a depth map can be estimated from surface normals via a process known as surface integration (see [55] for a recently reported approach). This is a difficult problem when the surface normal estimates are noisy or subject to bias. When augmented with depth estimates, potentially at a lower resolution, the two sources of information can be combined to make a robust estimate of the surface using an efficient algorithm due to Nehab et al. [45]. This approach is particularly suitable where the depth map is subject to high frequency noise (e.g. from errors in stereo correspondence) and the surface normals are subject to low frequency bias (e.g. when using photometric stereo with inaccurate light source directions).
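
A minimal sketch of the local shading mentioned above is given below, assuming a Lambertian reflectance function, a single distant light source and the convention that the surface normal is proportional to [−p, −q, 1]T; other reflectance functions or normal conventions would change the details.

```python
import numpy as np

def shade_needle_map(p, q, light=(0.0, 0.0, 1.0), albedo=1.0):
    """Render a needle map with a Lambertian reflectance function.

    p, q  : per-pixel surface gradients dz/dx and dz/dy
    light : vector pointing towards the (distant) light source
    Returns a grayscale image of the locally shaded surface."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    s = np.asarray(light, dtype=np.float64)
    s = s / np.linalg.norm(s)
    # Unnormalized surface normal [-p, -q, 1] at every pixel, then normalized.
    n = np.dstack((-p, -q, np.ones_like(p)))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    # Lambert's cosine law, clamped to zero for surfaces facing away from the light.
    return np.clip(albedo * (n @ s), 0.0, None)
```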

4.2.1.5 Polygon Soup

A polygon soup is, in some senses, analogous to point cloud data, but comprises polygons rather than vertices. More precisely, it is a set of unstructured polygons [44], each of which connects vertices together but which are not themselves connected in a coherent structure such as a mesh. Such models may arise in an interactive modeling system where a user creates and places polygons into a scene without specifying how the polygons connect to each other. This sort of data may contain errors such as: inconsistently oriented polygons; intersecting, overlapping or missing polygons; cracks (shared edges not represented as such); or T-junctions. This causes problems for many applications including rendering, collision detection, finite element analysis and solid modeling operations. To create a closed surface, a surface fitting algorithm must be applied to the unstructured polygons. For example, Shen et al. [60] show how to fit an implicit surface to polygon soup data.
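
One simple remedy for one class of the defects listed above, namely cracks caused by coincident but unshared vertices, is to weld nearby vertices into a single indexed vertex list. The following Python sketch assumes the soup is given as a t × 3 × 3 array of triangle corner positions and merges corners that agree to within a tolerance; it does not address inconsistent orientation, intersections or T-junctions, for which surface fitting methods such as that of Shen et al. [60] are more appropriate.

```python
import numpy as np

def weld_triangle_soup(triangles, tol=1e-6):
    """Convert a triangle soup into an indexed mesh by welding vertices.

    triangles : (t, 3, 3) array, t triangles each given by three xyz corners
    tol       : corners closer than this quantization step are merged
    Returns (vertices, faces) where faces indexes the shared vertex list."""
    corners = triangles.reshape(-1, 3)
    # Quantize positions to a grid of size tol so nearly coincident corners
    # receive identical integer keys and are treated as the same vertex.
    keys = np.round(corners / tol).astype(np.int64)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    # Take one representative position for each welded vertex.
    vertices = np.zeros((len(uniq), 3))
    vertices[inverse] = corners
    faces = inverse.reshape(-1, 3)
    return vertices, faces
```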