- •1 Seeing: Blazing Processing Characteristics
- •1.1 An Infinite Reservoir of Information
- •1.2 Speed
- •1.3 Illusions
- •1.4 Recognition Evolvement
- •1.5 Basic-Level Categorization
- •1.6 Memory Capacity and Access
- •1.7 Summary
- •2.1 Structural Variability Independence
- •2.2 Viewpoint Independence
- •2.3 Representation and Evolvement
- •2.3.1 Identification Systems
- •2.3.3 Template Matching
- •2.3.4 Scene Recognition
- •2.4 Recapitulation
- •2.5 Refining the Primary Engineering Goal
- •3 Neuroscientific Inspiration
- •3.1 Hierarchy and Models
- •3.2 Criticism and Variants
- •3.3 Speed
- •3.5 Alternative Shape Recognition
- •3.6 Insight from Cases of Visual Agnosia
- •3.7 Neuronal Level
- •3.8 Recapitulation and Conclusion
- •4 Neuromorphic Tools
- •4.1 The Transistor
- •4.2 A Synaptic Circuit
- •4.3 Dendritic Compartments
- •4.4 An Integrate-and-Fire Neuron
- •4.5 A Silicon Cortex
- •4.6 Fabrication Vagrancies require Simplest Models
- •4.7 Recapitulation
- •5 Insight From Line Drawings Studies
- •5.1 A Representation with Polygons
- •5.2 A Representation with Polygons and their Context
- •5.3 Recapitulation
- •6 Retina Circuits Signaling and Propagating Contours
- •6.1 The Input: a Luminance Landscape
- •6.2 Spatial Analysis in the Real Retina
- •6.2.1 Method of Adjustable Thresholds
- •6.2.2 Method of Latencies
- •6.3 The Propagation Map
- •6.4 Signaling Contours in Gray-Scale Images
- •6.4.1 Method of Adjustable Thresholds
- •6.4.2 Method of Latencies
- •6.4.3 Discussion
- •6.5 Recapitulation
- •7 The Symmetric-Axis Transform
- •7.1 The Transform
- •7.2 Architecture
- •7.3 Performance
- •7.4 SAT Variants
- •7.5 Fast Waves
- •7.6 Recapitulation
- •8 Motion Detection
- •8.1 Models
- •8.1.1 Computational
- •8.1.2 Biophysical
- •8.2 Speed Detecting Architectures
- •8.3 Simulation
- •8.4 Biophysical Plausibility
- •8.5 Recapitulation
- •9 Neuromorphic Architectures: Pieces and Proposals
- •9.1 Integration Perspectives
- •9.2 Position and Size Invariance
- •9.3 Architecture for a Template Approach
- •9.4 Basic-Level Representations
- •9.5 Recapitulation
- •10 Shape Recognition with Contour Propagation Fields
- •10.1 The Idea of the Contour Propagation Field
- •10.2 Architecture
- •10.3 Testing
- •10.4 Discussion
- •10.5 Learning
- •10.6 Recapitulation
- •11 Scene Recognition
- •11.1 Objects in Scenes, Scene Regularity
- •11.2 Representation, Evolvement, Gist
- •11.3 Scene Exploration
- •11.4 Engineering
- •11.5 Recapitulation
- •12 Summary
- •12.1 The Quest for Efficient Representation and Evolvement
- •12.2 Contour Extraction and Grouping
- •12.3 Neuroscientific Inspiration
- •12.4 Neuromorphic Implementation
- •12.5 Future Approach
- •Terminology
- •References
- •Index
- •Keywords
- •Abbreviations
At t = 11, the first spikes are signaled, representing the silhouette areas. No contours per se are signaled within that time slice, but afterwards the border spikes begin to propagate across the lower-intensity (darker) areas (t = 13 and later). Darker areas, such as the lines dividing the drawers, are signaled later. In this specific simulation the signaling of bright areas starts late (at t = 11), which causes propagation across many dark areas before those areas have themselves been signaled. Other contours can be signaled by adjusting the parameter values accordingly, but this remains to be elaborated. This specific transformation does not offer any contour propagation across the bright areas; such propagation could, however, be achieved by feeding the spikes of this transformation into a subsequent propagation map.
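The latency-and-propagation behavior just described can be illustrated with a toy simulation (a hypothetical sketch, not the simulation reported here; the intensity-to-latency mapping and grid propagation rule are illustrative assumptions): each unit's intrinsic spike time decreases with pixel intensity, and once a unit fires, its spike recruits unfired 4-neighbors one grid step per time slice, so a wavefront can cross dark areas before their own latencies have elapsed.

```python
# Toy sketch of latency coding plus a propagation map (illustrative
# parameters, not the book's simulation): brighter pixels fire earlier,
# and fired pixels recruit unfired neighbors one step per time slice.
import numpy as np

def latency_propagation(img, t_max=20, scale=10.0):
    """img: 2-D array of intensities in [0, 1].
    Returns firing times per pixel (np.inf = never fired)."""
    fire_time = np.full(img.shape, np.inf)
    # Latency code: intrinsic spike time inversely related to intensity.
    latency = np.rint(scale * (1.0 - img)).astype(int)
    for t in range(t_max):
        # Pixels whose intrinsic latency has elapsed fire now.
        fire_time[(latency <= t) & np.isinf(fire_time)] = t
        # Propagation: fired pixels recruit their unfired 4-neighbors.
        fired = np.isfinite(fire_time) & (fire_time <= t)
        recruit = np.zeros_like(fired)
        recruit[1:, :] |= fired[:-1, :]
        recruit[:-1, :] |= fired[1:, :]
        recruit[:, 1:] |= fired[:, :-1]
        recruit[:, :-1] |= fired[:, 1:]
        fire_time[recruit & np.isinf(fire_time)] = t + 1
    return fire_time

img = np.zeros((5, 5))
img[2, 2] = 1.0                 # one bright pixel in a dark field
times = latency_propagation(img)
# The bright pixel fires at t=0; the dark corner, whose intrinsic
# latency is 10, is reached by the propagating wave already at t=4.
print(times[2, 2], times[0, 0])
```

Note how the corner pixel is signaled by propagation long before its own latency elapses, mirroring the effect described above for the dark areas.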
6.4.3 Discussion
The method of adjustable spiking thresholds signals contours relatively immediately, compared to the method of latencies, and propagates them across its map. It therefore represents a compact way to obtain contours and their propagation simultaneously. The method of latencies is somewhat more intricate: the contours are stretched out in time, and the retinal network provides only partial propagation. But it offers the following advantages. Firstly, it may be easier to implement in analog hardware than the method of adjustable thresholds, whose fast process is not explicitly simulated. Secondly, if one observed a neuron 'patiently', one would see a firing rate that actually reflects the intensity of the pixel. This latter point has already been suggested by Thorpe: the latency information would provide fast computation, the rate information slower computation. Applied specifically to the purpose of categorization, the latencies suffice for generating the contour image and for subsequent perceptual categorization; for determining other aspects of visual information, which may happen on a slower time scale, a rate code may serve well.
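The two read-outs mentioned above can be contrasted in a minimal sketch (assumed constants, not a model from this book): for a regular spike train whose inter-spike interval shrinks with intensity, the first-spike latency yields a fast read-out, while counting spikes over a longer window yields the slower rate read-out of the same intensity.

```python
# Illustrative contrast between a fast latency read-out and a slow rate
# read-out of one pixel's intensity. The interval/intensity relation and
# the 100-unit observation window are arbitrary assumptions.
def spike_times(intensity, t_max=100.0):
    """Regular spike train: brighter pixels -> shorter intervals."""
    interval = 10.0 / intensity          # toy relation
    t, out = interval, []
    while t <= t_max:
        out.append(t)
        t += interval
    return out

readouts = {}
for intensity in (0.5, 1.0, 2.0):
    spikes = spike_times(intensity)
    latency = spikes[0]                  # fast code: first spike only
    rate = len(spikes) / 100.0           # slow code: count over window
    readouts[intensity] = (latency, rate)
    print(f"I={intensity}: latency={latency:.0f}, rate={rate:.2f}")
```

The latency is available after a single spike, whereas the rate requires waiting out the window; this is the trade-off attributed to Thorpe in the text.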
6.5 Recapitulation
We started this chapter by reviewing Fu's description of the luminance profile as a geographical map. We have not specifically addressed the diversity of this profile, but one should keep it in mind if one plans to extend this approach to low-resolution gray-scale images, in which structure can be very subtly embedded.
We have proposed two methods for contour extraction. The method of adjustable thresholds is a process that detects steep gradients (contours) in a luminance profile. Very high-contrast contours are signaled first, followed by lower-contrast contours. Gaps in the contour image are filled by the expanding and merging propagation process; to put it into a succinct phrase, contour propagation seals gaps. In the method of latencies, the edges are separated in time.
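The adjustable-threshold process just summarized can be sketched as follows (a minimal illustration under assumed parameters, not the book's implementation): a spiking threshold on local contrast is lowered step by step, so the steepest gradients are signaled first and lower-contrast contours join later, and a single neighbor-recruitment step stands in for the gap-sealing propagation.

```python
# Sketch of the adjustable-threshold idea: descending contrast threshold
# plus one propagation step. Gradient measure, number of steps, and the
# final threshold floor are illustrative assumptions.
import numpy as np

def adjustable_threshold_contours(img, steps=4):
    # Local contrast: gradient magnitude of the luminance profile.
    gy, gx = np.gradient(img.astype(float))
    contrast = np.hypot(gx, gy)
    signaled = np.zeros(img.shape, dtype=bool)
    # Threshold descends over time: steepest gradients signal first,
    # lower-contrast contours are signaled in later time slices.
    for th in np.linspace(contrast.max(), contrast.max() / 2, steps):
        signaled |= contrast >= th
    # One propagation step: signaled points recruit their 4-neighbors,
    # sealing small gaps between contour fragments.
    grow = signaled.copy()
    grow[1:, :] |= signaled[:-1, :]; grow[:-1, :] |= signaled[1:, :]
    grow[:, 1:] |= signaled[:, :-1]; grow[:, :-1] |= signaled[:, 1:]
    return grow

img = np.zeros((5, 6))
img[:, 3:] = 1.0                     # a vertical luminance step
result = adjustable_threshold_contours(img)
print(result.astype(int))
```

On this step image, the two columns flanking the luminance edge carry all the contrast and are signaled; the propagation step then thickens the contour into the adjacent columns while the flat regions stay silent.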
The output of the retina looks very similar to the output of an edge-detection algorithm used in computer vision. Still, the obtained contours are fragmented, as they are with any method performing contour detection. In computer vision, much effort has been directed toward obtaining a more complete contour image, often combined with efforts called image segmentation (Palmer, 1999). Yet the fragmented contour image already provides an enormous wealth of structural information: it delineates many of the regions we identified in our line-drawing studies. The output therefore suffices for the perceptual categorization we aim at (figure 2, left side). The loose representations we search for have to cope with part-shape variability, part-alignment variability and part redundancy anyway (section 2.1); it therefore does not matter whether one contour or another is missing after the contour extraction process. This lack of contour pieces is likely a smaller problem in the construction of a visual system (if it is a problem at all) than the challenge of dealing with structural variability.
The simulations presented so far have been purely software simulations. How either retina model can be translated into analog VLSI remains to be worked out. A starting point would be to develop the propagation map, perhaps using the propagation tools for neuromorphic dendrites (section 4.3); in a next step, one would insert the contour detection mechanism.
