Добавил:
kiopkiopkiop18@yandex.ru t.me/Prokururor I Вовсе не секретарь, но почту проверяю Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Ординатура / Офтальмология / Английские материалы / Binocular Vision Development, Depth Perception and Disorders_McCoun, Reeves_2010.pdf
Скачиваний:
0
Добавлен:
28.03.2026
Размер:
9.88 Mб
Скачать

In: Binocular Vision

c

ISBN 978-1-60876-547-8

Editors: J. McCoun et al, pp. 189-208

2010 Nova Science Publishers, Inc.

 

 

 

Chapter 9

STEREO-BASED CANDIDATE GENERATION

FOR PEDESTRIAN PROTECTION SYSTEMS

David Geronimo1, Angel D. Sappa1 and Antonio M. Lopez´1,2

1Computer Vision Center and

2Computer Science Department Universitat Autonoma` de Barcelona, 08193, Bellaterra, Barcelona, Spain

Abstract

This chapter describes a stereo-based algorithm that provides candidate image windows to a latter 2D classification stage in an on-boa rd pedestrian detection system. The proposed algorithm, which consists of three stages, is based on the use of both stereo imaging and scene prior knowledge (i.e., pedestrians are on the ground) to reduce the candidate searching space. First, a successful road surface fitting algorith m provides estimates on the relative ground-camera pose. This stage directs the search toward the road area thus avoiding irrelevant regions like the sky. Then, three different schemes are used to scan the estimated road surface with pedestrian-sized windows: (a) uniformly distributed through the road surface (3D); (b) uniformly distributed through the image (2D); (c) not uniformly distributed but according to a quadratic function (combined 2D3D). Finally, the set of candidate windows is reduced by analyzing their 3D content. Experimental results of the proposed algorithm, together with statistics of searching space reduction are provided.

E-mail address: dgeronimo@cvc.uab.es

190 David Geronimo, Angel D. Sappa and Antonio M. Lopez´

1.Introduction

According to the World Health Organization, every year almost 1.2 million people are killed and 50 million are injured in traffic accidents worldwide [11]. These dramatic statistics highlight the importance of the research in traffic safety, which involves not only motor companies but also governments an d universities.

Since the early days of the automobile, in the beginning of 20th century, and along with its popularization, different mechanisms were successfully incorporated to the vehicle with the aim of improving its safety. Some examples are turn signals, seat-belts and airbags. These mechanisms, which rely on physical devices, were focused on improving safety specifically when accidents w here happening. In the 1980s a sophisticated new line of research began to pursue safety in a preventive way: the so-called advanced driver assistance systems (ADAS). These systems provide information to the driver and perform active actions (e.g., automatic braking) by the use of different sensors and intelligent computation. Some ADAS examples are adaptive cruise control (ACC), which automatically maintains constant distance to a front-vehicle in the same lane, and lane departure warning (LDW), which warns when the car is driven out the lane unadvertently.

One of the more complex ADAS are pedestrian protection systems (PPSs), which aim at improving the safety of these vulnerable road users. Attending to the number of people involved in vehicle-to-pedestrian accidents, e.g., 150 000 injured and 7 000 killed people each year in the European Union [6], it is clear that any improvement in these systems can potentially save many human lifes. PPSs detect the presence of people in a specific area of interest aroun d the host vehicle in order to warn the driver, perform braking actions and deploy external airbags in the case of an unavoidable collision. The most used sensor to detect pedestrians are cameras, contrary to other ADAS such as ACC, in which active sensors like radar or lidar are employed. Hence, Computer Vision (CV) techniques play a key role in this research area, which is not strange given that vision is the most used human sense when driving. People detection has been an important topic of research since the beginning of CV, and it has been mainly focused on applications like surveillance, image retrieval and human-machine interfaces. However, the problem faced by PPSs differs from these applications and is far from being solved. The main challenges of PPSs are summarized in the following points:

Stereo-Based Candidate Generation...

191

Pedestrians have a high variability in pose (human body can be viewed as a highly deformable target), clothes (which change with the weather, culture, and people), distance (typically from 5 to at least 25m), sizes (not only adults and children are different, but also there are many different human constitutions), viewpoints (e.g., front, back or side viewed).

The variability of the scenarios is also considerable, i.e., the detection takes place in outdoor dynamic urban roads with cluttered background and illumination changes.

The requirements in terms of misdetections and computational cost are hard: these systems must perform real-time actions at very low miss rates.

The first research works in PPSs were presented in the late 1990s. Pap ageorgiou et al. [10] proposed to extract candidate windows by exhaustively scanning the input image and classify them with support vector machines based on Haar Wavelet features. This two-step candidate generation and classification s cheme has been used in a countless number of detection systems: from faces [14], vehicles or generic object detection to human surveillance and image retrieval [3]. The simplest candidate generation approach is the exhaustive scan, also called sliding window: it consists in scanning the input image with pedestrian-sized windows (i.e., with a typical aspect ratio around 1/2) at all the possible scales and positions. Although this candidate generation method is generic and easy to implement, it can be improved by making use of some prior knowledge from the application. Accordingly, during the last decade researchers have tried to exploit the specific aspects of PPSs to avoid this generation technique. Some cu es used for generating candidates are vertical symmetry [1], infrared hot spots [4] and 3D points [7]. However, the proposed techniques that exploit them pose several problems that make the systems not reliable in real-world scenarios. For example, in the case of 2D analysis, the number of false negatives (i.e., discarded pedestrians) can not be guaranteed to be low enough: symmetry relies on vertical edges, but in many cases the illumination conditions or background clutter make them disappear. Hot spot analysis in infrared images holds a similar problem because of the environmental conditions [2]. On the other hand, although stereo stands as a more reliable cue, the aforementioned techniques also hold problems. In the case of [7], the algorithm assumes a constant road slope, so the problems appear when the road orientation is not constant which is common in urban scenarios.

192 David Geronimo, Angel D. Sappa and Antonio M. Lopez´

This chapter presents a candidate generation algorithm that reduces the number of windows to be classified while minimizes the number of wrongly discarded targets. This is achieved by combining a prior-knowledge criterion, pedestrians-on-the-ground, and using 3D data to filter the candidates. This procedure can be seen as a conservative but reliable approach, which in our opinion is the most convenient option for this early step of the system.

The remainder of the manuscript is organized as follows. First, we introduce the proposed candidate generation algorithm with a brief description of its components and their objective. Then, the three stages in which the algorithm is divided are presented: Sect. 3. describes the road surface estimation algorithm, Sect. 4. presents the road scanning and Sect. 5. addresses the candidate filtering. Finally, Sect. 6. provides experimental results of the algorithm output. In Sect. 7., conclusions and future work is presented.

2.Algorithm Overview

A recent survey on PPSs by Geronimo´ et al. [8] proposes a general architecture that consists of six modules, in which most of the existing systems can be fit. The modules (enumerated in the order of the pipeline process) are: 1 ) preprocessing, 2) foreground segmentation, 3) object classification, 4) verification and refinement, 5) tracking and 6) application. As can be seen, modules 2) and 3) correspond to the steps presented in the introduction. The algorithm presented in this chapter consists in a candidate generation algorithm to be used in the foreground segmentation module, which gets an input image and generates a list of candidates where a pedestrian is likely to appear, to be sent to the next module, the classifier. There are two main objectives to be carried o ut in this module. The first is to reduce the number of candidates, which directly affects the performance of the system both in terms of speed (the fewer the candidates sent to the classifier the less the computation time is) and detection rates (negatives can be pre-filtered by this module). The second is not to disca rd any pedestrian, otherwise the later modules will not be able to correct the wrong filtering.

The proposed algorithm is divided into three stages, as illustrated in Fig. 1.

1.Road surface estimation computes the relative position and orientation between the camera and the scene (Sect. 3.).