
17 Real-Time Image Processing
The coding is shown in Program 17.3. Only a single loop is used to run over all pixels. Again, we neglect a one-pixel-wide borderline; pixels in the first and last row of the result image are set to zero. The program already applies a heuristic scaling (divide by three) and limits the maximum value to white (255), so the result value remains a single byte.
17.4 Motion Detection
The idea for a very basic motion detection algorithm is to subtract two subsequent images (see also Figure 17.5):
1. Compute the absolute value of the grayscale difference for all pixel pairs of the two subsequent images.
2. Compute the average over all pixel pairs.
3. If the average is above a threshold, then motion has been detected.
Figure 17.5: Motion detection
This method only detects the presence of motion in an image pair, but does not determine any direction or area. Program 17.4 shows the implementation of this problem with a single loop over all pixels, summing up the absolute differences of all pixel pairs. The routine returns 1 if the average difference per pixel is greater than the specified threshold, and 0 otherwise.
Program 17.4: Motion detection
int motion(image im1, image im2, int threshold)
{ int i, diff = 0;
  BYTE *p1 = (BYTE*) im1, *p2 = (BYTE*) im2;  /* flat pixel access */

  for (i = 0; i < height*width; i++)
    diff += abs(p1[i] - p2[i]);
  return (diff > threshold*height*width);     /* 1 if motion */
}
This algorithm could also be extended to calculate motion separately for different areas (for example the four quarters of an image), in order to locate the rough position of the motion.
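As an illustration of this extension, here is a short sketch (not from the original text; the function name, the 2D BYTE image layout, and applying the same per-pixel average threshold to each quarter are assumptions):

/* Detect motion separately in the four image quarters.
   Returns a 4-bit mask: bit 0 = top-left, 1 = top-right,
   2 = bottom-left, 3 = bottom-right. */
int MotionQuadrants(image im1, image im2, int threshold)
{ int x, y, q, mask = 0;
  int diff[4] = {0, 0, 0, 0};

  for (y = 0; y < height; y++)
    for (x = 0; x < width; x++)
    { q = (y >= height/2)*2 + (x >= width/2);   /* quadrant index */
      diff[q] += abs(im1[y][x] - im2[y][x]);
    }
  for (q = 0; q < 4; q++)   /* threshold each quadrant's average */
    if (diff[q] > threshold * (height/2) * (width/2)) mask |= 1 << q;
  return mask;              /* non-zero if motion detected anywhere */
}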
17.5 Color Space
Before looking at a more complex image processing algorithm, we take a sidestep and look at different color representations or “color spaces”. So far we have seen grayscale and RGB color models, as well as Bayer patterns (RGGB). There is not one superior way of representing color information, but a number of different models with individual advantages for certain applications.
17.5.1 Red Green Blue (RGB)
The RGB space can be viewed as a 3D cube with red, green, and blue being the three coordinate axes (Figure 17.6). The line joining the points (0, 0, 0) and (1, 1, 1) is the main diagonal in the cube and represents all shades of gray from black to white. It is usual to normalize the RGB values between 0 and 1 for floating point operations or to use a byte representation from 0 to 255 for integer operations. The latter is usually preferred on embedded systems, which do not possess a hardware floating point unit.
[Image: RGB color cube with corners (0,0,0) black, (1,0,0) red, (0,1,0) green, (0,0,1) blue, and (1,1,1) white]
Figure 17.6: RGB color cube
In this color space, a color is determined by its red, green, and blue components in an additive synthesis. The main disadvantage of this color space is that the color hue is not independent of intensity and saturation of the color.
Luminosity L in the RGB color space is defined as the sum of all three components:
L = R+G+B
Luminosity is therefore dependent on the three components R, G, and B.
17.5.2 Hue Saturation Intensity (HSI)
The HSI color space (see Figure 17.7) is a cone where the middle axis represents luminosity, the phase angle represents the hue of the color, and the radial distance represents the saturation. The following set of equations specifies the conversion from RGB to HSI color space:
$$I = \frac{1}{3}\,(R + G + B)$$

$$S = 1 - \frac{3}{R + G + B}\left[\min(R, G, B)\right]$$

$$H = \cos^{-1}\left\{\frac{\frac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^2 + (R - B)(G - B)\right]^{1/2}}\right\}$$
[Image: HSI color cone with hue as phase angle, saturation as radial distance, and intensity as vertical axis]
Figure 17.7: HSI color cone
The advantage of this color space is that it de-correlates the intensity information from the color information. A grayscale value is represented by its intensity, zero saturation, and an arbitrary hue value. Chromatic (color) and achromatic (grayscale) pixels can therefore be distinguished simply by their saturation value. On the other hand, because of the same relationship, the hue value alone is not sufficient to identify pixels of a certain color; the saturation also has to be above a certain threshold value.
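To make the conversion concrete, here is a floating-point sketch of the formulas above (not from the book; it assumes R, G, B already normalized to [0, 1] and marks achromatic pixels with a hue of -1):

#include <math.h>
#ifndef M_PI
#define M_PI 3.14159265358979
#endif

/* Convert RGB in [0,1] to HSI following the formulas above.
   H is in radians in [0, 2*pi); undefined (set to -1) for gray pixels. */
void RGBtoHSI(float R, float G, float B, float *H, float *S, float *I)
{ float min, num, den;

  *I  = (R + G + B) / 3.0f;
  min = fminf(R, fminf(G, B));
  *S  = (*I > 0.0f) ? 1.0f - min / *I : 0.0f;   /* = 1 - 3*min/(R+G+B) */

  num = 0.5f * ((R - G) + (R - B));
  den = sqrtf((R - G)*(R - G) + (R - B)*(G - B));
  if (den == 0.0f) *H = -1.0f;                  /* R==G==B: achromatic */
  else
  { *H = acosf(num / den);                      /* angle in [0, pi] */
    if (B > G) *H = 2.0f*(float)M_PI - *H;      /* mirror lower half */
  }
}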
17.5.3 Normalized RGB (rgb)
Most camera image sensors deliver pixels in an RGB-like format, for example Bayer patterns (see Section 2.9.2). Converting all pixels from RGB to HSI may be too computationally intensive an operation for an embedded controller. Therefore, we look at a faster alternative with similar properties.
One way to make the RGB color space more robust with regard to lighting conditions is to use the “normalized RGB” color space (denoted by “rgb”) defined as:
$$r = \frac{R}{R + G + B}, \qquad g = \frac{G}{R + G + B}, \qquad b = \frac{B}{R + G + B}$$
This normalization of the RGB color space allows us to describe a certain color independently of the luminosity (sum of all components). This is because the luminosity in rgb is always equal to one:
$$r + g + b = 1 \quad \forall\,(r, g, b)$$
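As a minimal sketch of this normalization (not from the book; the function name and the integer scaling to [0..255] are assumptions, chosen to avoid floating point on an embedded controller):

/* Normalize one RGB pixel to rgb, scaled to 0..255 so that
   r + g + b == 255 (up to rounding), independent of luminosity. */
void RGBtoNormalized(BYTE R, BYTE G, BYTE B, BYTE *r, BYTE *g, BYTE *b)
{ int sum = R + G + B;
  if (sum == 0) { *r = *g = *b = 0; return; }  /* black: avoid div by 0 */
  *r = (BYTE)(255 * R / sum);
  *g = (BYTE)(255 * G / sum);
  *b = (BYTE)(255 * B / sum);
}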
17.6 Color Object Detection
If it is guaranteed for a robot environment that a certain color only exists on one particular object, then we can use color detection to find this particular object. This assumption is widely used in mobile robot competitions, for example the AAAI’96 robot competition (collect yellow tennis balls) or the RoboCup and FIRA robot soccer competitions (kick the orange golf ball into the yellow or blue goal). See [Kortenkamp, Nourbakhsh, Hinkle 1997], [Kaminka, Lima, Rojas 2002], and [Cho, Lee 2002].
The following hue-histogram algorithm for detecting colored objects was developed by Bräunl in 2002. It requires minimal computation time and is therefore very well suited for embedded vision systems. The algorithm performs the following steps:
1. Convert the RGB color image to a hue image (HSI model).
2. Create a histogram over all image columns of pixels matching the object color.
3. Find the maximum position in the column histogram.
The first step simply makes it easier to compare two color pixels for similarity. Instead of comparing the differences between three values (red, green, blue), only a single hue value needs to be compared (see [Hearn, Baker 1997]). In the second step we look at each image column separately and record how many pixels are similar to the desired ball color. For a 60×80 image, the histogram comprises just 80 integer values (one for each column) with values between 0 (no similar pixels in this column) and 60 (all pixels similar to the ball color).
At this level, we are not concerned about continuity of the matching pixels in a column. There may be two or more separate sections of matching pixels, which may be due to either occlusions or reflections on the same object – or there might be two different objects of the same color. A more detailed analysis of the resulting histogram could distinguish between these cases.
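Such a more detailed analysis might, for instance, count the contiguous runs of matching pixels within a single column. The following sketch is an assumption, not part of the original algorithm; it uses the NO_HUE marker and the circular hue distance (wraparound at 253) that appear in the programs below:

/* Sketch (assumption): count contiguous runs of matching pixels in
   column x of a hue image. A result of 1 indicates one coherent
   segment; >1 indicates occlusion, reflection, or several objects. */
int CountRuns(image hue_img, int x, int obj_hue, int thres)
{ int y, d, match, runs = 0, inside = 0;

  for (y = 0; y < imagerows; y++)
  { d = abs(hue_img[y][x] - obj_hue);
    if (d > 126) d = 253 - d;          /* circular hue distance */
    match = (hue_img[y][x] != NO_HUE && d < thres);
    if (match && !inside) runs++;      /* a new run starts here */
    inside = match;
  }
  return runs;
}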
Program 17.5: RGB to hue conversion
#define NO_HUE -1

int RGBtoHue(BYTE r, BYTE g, BYTE b)
/* return hue value for RGB color */
{ int hue, delta, max, min;

  max   = MAX(r, MAX(g,b));
  min   = MIN(r, MIN(g,b));
  delta = max - min;
  hue = 0;                                /* init hue */
  if (2*delta <= max) hue = NO_HUE;       /* gray, no color */
  else
  { if      (r==max) hue =  42 + 42*(g-b)/delta;  /* 1*42 */
    else if (g==max) hue = 126 + 42*(b-r)/delta;  /* 3*42 */
    else if (b==max) hue = 210 + 42*(r-g)/delta;  /* 5*42 */
  }
  return hue;  /* now: hue is in range [0..252] */
}
Program 17.5 shows the conversion of an RGB value to a hue value (from the hue, saturation, value model), following [Hearn, Baker 1997]. We drop the saturation and value components, since we only need the hue for detecting a colored object like a ball. However, saturation is used to detect invalid hues (NO_HUE) in the case of too low a saturation (r, g, and b having similar or identical values for grayscale pixels), because in these cases arbitrary hue values can occur.
[Image: input image with sample column marked; histogram with counts of matching pixels per column (0 0 0 0 5 21 32 18 3 0 1 0 2 0 0 0 0 0 0); the column with the maximum number of matches marks the object]
Figure 17.8: Color detection example
The next step is to generate a histogram over all x-positions (over all columns) of the image, as shown in Figure 17.8. We need two nested loops going over every single pixel and incrementing the histogram array in the corresponding position. The specified threshold limits the allowed deviation from the desired object color hue. Program 17.6 shows the implementation.
Program 17.6: Histogram generation
void GenHistogram(image hue_img, int obj_hue,
                  line histogram, int thres)
/* generate histogram over all columns */
{ int x, y;

  for (x=0; x<imagecolumns; x++)
  { histogram[x] = 0;
    for (y=0; y<imagerows; y++)
      if (hue_img[y][x] != NO_HUE &&
          (abs(hue_img[y][x] - obj_hue) < thres ||
           253 - abs(hue_img[y][x] - obj_hue) < thres))
        histogram[x]++;
  }
}
Finally, we need to find the maximum position in the generated histogram. This again is a very simple operation in a single loop, running over all positions of the histogram. The function returns both the maximum position and the maximum value, so the calling program can determine whether a sufficient number of matching pixels has been found. Program 17.7 shows the implementation.
Program 17.7: Object localization
void FindMax(line histogram, int *pos, int *val)
/* return maximum position and value of histogram */
{ int x;

  *pos = -1; *val = 0;  /* init */
  for (x=0; x<imagecolumns; x++)
    if (histogram[x] > *val)
    { *val = histogram[x]; *pos = x; }
}
Programs 17.6 and 17.7 can be combined for a more efficient implementation with only a single loop and reduced execution time. This also eliminates the need for explicitly storing the histogram, since we are only interested in the maximum value. Program 17.8 shows the optimized version of the complete algorithm.
For demonstration purposes, the program draws a line in each image column representing the number of matching pixels, thereby visualizing the histogram directly in the image. This method works equally well on the simulator and on the real robot.
Program 17.8: Optimized color search
void ColSearch(colimage img, int obj_hue, int thres,
               int *pos, int *val)
/* find x position of color object, return pos and value */
{ int x, y, count, h, distance;

  *pos = -1; *val = 0;  /* init */
  for (x=0; x<imagecolumns; x++)
  { count = 0;
    for (y=0; y<imagerows; y++)
    { h = RGBtoHue(img[y][x][0], img[y][x][1], img[y][x][2]);
      if (h != NO_HUE)
      { distance = abs((int)h - obj_hue);             /* hue distance */
        if (distance > 126) distance = 253 - distance;
        if (distance < thres) count++;
      }
    }
    if (count > *val) { *val = count; *pos = x; }
    LCDLine(x, 53, x, 53-count, 2);  /* visualization only */
  }
}
Figure 17.9: Color detection on EyeSim simulator
In Figure 17.9, the environment window with a colored ball and the console window with the displayed image and histogram can be seen.
Program 17.9: Color search main program
#define X 40   /* ball coordinates for teaching */
#define Y 40

int main()
{ colimage c;
  int hue, pos, val;

  LCDPrintf("Teach Color\n");
  LCDMenu("TEA","","","");
  CAMInit(NORMAL);
  while (KEYRead() != KEY1)
  { CAMGetColFrame(&c,0);
    LCDPutColorGraphic(&c);
    hue = RGBtoHue(c[Y][X][0], c[Y][X][1], c[Y][X][2]);
    LCDSetPos(1,0);
    LCDPrintf("R%3d G%3d B%3d\n",
              c[Y][X][0], c[Y][X][1], c[Y][X][2]);
    LCDPrintf("hue %3d\n", hue);
    OSWait(100);
  }

  LCDClear();
  LCDPrintf("Detect Color\n");
  LCDMenu("","","","END");
  while (KEYRead() != KEY4)
  { CAMGetColFrame(&c,0);
    LCDPutColorGraphic(&c);
    ColSearch(c, hue, 10, &pos, &val);  /* search image */
    LCDSetPos(1,0);
    LCDPrintf("h%3d p%2d v%2d\n", hue, pos, val);
    LCDLine(pos, 0, pos, 53, 2);        /* vertical line */
  }
  return 0;
}
The main program for the color search is shown in Program 17.9. In its first phase, the camera image is constantly displayed together with the RGB value and hue value of the middle position. The user can record the hue value of an object to be searched. In the second phase, the color search routine is called with every image displayed. This will display the color detection histogram and also locate the object’s x-position.
This algorithm only determines the x-position of a colored object. It could easily be extended to do the same histogram analysis over all lines (instead of over all columns) as well and thereby produce the full [x, y] coordinates of an object. To make object detection more robust, we could further extend this
algorithm by asserting that a detected object has more than a certain minimum number of similar pixels per line or per column. By returning start and finish values for the line histogram and the column histogram, we get [x1, y1] as the object's start coordinates and [x2, y2] as its finish coordinates. This rectangular area can then be transformed into an object center and object size.
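A sketch of this histogram-bounds extension is shown below (an assumption, not the book's code). Applied to the column histogram it yields [x1, x2]; applied to an analogous row histogram it yields [y1, y2]:

/* Return the first and last positions whose count exceeds min_count,
   i.e. the start and finish of the object along one axis.
   Both are set to -1 if no position qualifies. */
void HistBounds(int histogram[], int len, int min_count,
                int *start, int *finish)
{ int i;
  *start = -1; *finish = -1;
  for (i = 0; i < len; i++)
    if (histogram[i] >= min_count)
    { if (*start < 0) *start = i;   /* first sufficient position */
      *finish = i;                  /* last sufficient position  */
    }
}
/* object center = (start+finish)/2, object size = finish-start+1 */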
17.7 Image Segmentation
Detecting a single object that differs significantly either in shape or in color from the background is relatively easy. A more ambitious application is segmenting an image into disjoint regions. One way of doing this, for example in a grayscale image, is to use connectivity and edge information (see Section 17.3, [Bräunl 2001], and [Bräunl 2006] for an interactive system). The algorithm shown here, however, uses color information for faster segmentation results [Leclercq, Bräunl 2001].
This color segmentation approach transforms all images from RGB to rgb (normalized RGB) as a pre-processing step. Then, a color class lookup table is constructed that translates each rgb value to a “color class”, where different color classes ideally represent different objects. This table is a three-dimensional array with (r, g, b) as indices. Each entry is a reference number for a certain “color class”.
17.7.1 Static Color Class Allocation
Optimized for fixed application
If we know the number and characteristics of the color classes to be distinguished beforehand, we can use a static color class allocation scheme. For example, for robot soccer (see Chapter 18), we need to distinguish only three color classes: orange for the ball and yellow and blue for the two goals. In a case like this, the location of the color classes can be calculated to fill the table. For example, “blue goal” is defined for all points in the 3D color table for which blue dominates, or simply:
$$b > \mathit{thres}_b$$
In a similar way, we can distinguish orange and yellow, by a combination of thresholds on the red and green component:
$$\mathit{colclass} = \begin{cases} \text{blueGoal} & \text{if } b > \mathit{thres}_b \\ \text{yellowGoal} & \text{if } r > \mathit{thres}_r \text{ and } g > \mathit{thres}_g \\ \text{orangeBall} & \text{if } r > \mathit{thres}_r \text{ and } g \le \mathit{thres}_g \end{cases}$$
If (r, g, b) were coded as 8-bit values, the table would comprise (2⁸)³ = 16M entries, which comes to 16MB when using 1 byte per entry. This is too much memory
for a small embedded system, and also too high a resolution for this color segmentation task. Therefore, we use only the five most significant bits of each color component, which reduces the table to a more manageable size of (2⁵)³ = 32K entries, i.e. 32KB.
In order to determine the correct threshold values, we start with an image of the blue goal. We keep changing the blue threshold until the recognized rectangle in the image matches the projected goal dimensions. The thresholds for red and green are determined in a similar manner, trying different settings until the best distinction is found (for example, the orange ball should not be classified as the yellow goal and vice versa). With all thresholds determined, the corresponding color class (for example 1 for the ball, 2 or 3 for the goals) is calculated and entered for each rgb position in the color table. If none of the criteria is fulfilled, the particular rgb value belongs to none of the color classes and 0 is entered in the table. If more than one criterion is fulfilled, the color classes have not been properly defined and there is an overlap between them.
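The table construction just described might look as follows; this is a sketch under stated assumptions (the array and constant names, the class numbering, and the thresholds given on the reduced 5-bit scale are illustrative, not the book's code):

#define CBITS 5                      /* 5 most significant bits per component */
#define CSIZE (1 << CBITS)           /* 32 steps per axis, 32KB table */

BYTE colclass[CSIZE][CSIZE][CSIZE];  /* 0 = none, 1 = ball, 2/3 = goals */

/* Fill the static color class table from experimentally determined
   thresholds (all on the reduced 0..31 scale). Checking blue first
   avoids overlap between the classes. */
void FillTable(int thres_r, int thres_g, int thres_b)
{ int r, g, b;
  for (r = 0; r < CSIZE; r++)
    for (g = 0; g < CSIZE; g++)
      for (b = 0; b < CSIZE; b++)
      { BYTE c = 0;                           /* default: no color class */
        if      (b > thres_b)                 c = 2;  /* blue goal   */
        else if (r > thres_r && g >  thres_g) c = 3;  /* yellow goal */
        else if (r > thres_r && g <= thres_g) c = 1;  /* orange ball */
        colclass[r][g][b] = c;
      }
}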
17.7.2 Dynamic Color Class Allocation
General technique
However, in general it is also possible to use dynamic color class allocation, for example by teaching a certain color class instead of setting up fixed topological color borders. A simple way of defining a color space is to specify a sub-cube of the full rgb cube, for example by allowing a certain offset δ from the desired (taught) value (r′, g′, b′):
$$r \in [r' - \delta \,..\, r' + \delta], \qquad g \in [g' - \delta \,..\, g' + \delta], \qquad b \in [b' - \delta \,..\, b' + \delta]$$
Starting with an empty color table, each new sub-cube can be entered by three nested loops, setting all sub-cube positions to the new color class identifier. Other topological entries are also possible, of course, depending on the desired application.
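A sketch of these three nested loops (an assumption, not the book's code; colclass and CSIZE as in the static allocation sketch above, delta is the allowed offset δ on the table's index scale):

/* Enter a taught color (r0, g0, b0) with tolerance delta into the
   color class table, clamping the sub-cube to the table bounds. */
void AddColorClass(int r0, int g0, int b0, int delta, BYTE id)
{ int r, g, b;
  for (r = r0 - delta; r <= r0 + delta; r++)
    for (g = g0 - delta; g <= g0 + delta; g++)
      for (b = b0 - delta; b <= b0 + delta; b++)
        if (r >= 0 && r < CSIZE && g >= 0 && g < CSIZE &&
            b >= 0 && b < CSIZE)
          colclass[r][g][b] = id;    /* whole sub-cube gets class id */
}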
A new color can simply be added to the previously taught colors by placing a sample object in front of the camera and averaging a small number of center pixels to determine the object hue. A median filter over about 4×4 pixels will be sufficient for this purpose.
17.7.3 Object Localization
Having completed the color class table, segmenting an image seems simple. All we have to do is look up the color class for each pixel’s rgb value. This gives us a situation as sketched in Figure 17.10. Although to a human observer, coherent color areas and therefore objects are easy to detect, it is not trivial to extract this information from the 2D segmented output image.