Handbook of Biomedical Image Analysis, Vol. 1
Farag, Ahmed, El-Baz, and Hassan
component. To estimate the distribution for each class, we use the expectation maximization (EM) algorithm. The first step is to extract the dominant Gaussian components of the given empirical distribution.
9.2.5.1 Dominant Gaussian Component Extraction Algorithm
1. Assume the number of Gaussian components that represent the classes i, i = 1, ..., m, and initialize the parameters of each component randomly.
2. The E-step: compute the responsibility δit, the probability that the given pixel value was drawn from component i, as

$$\delta_{it}^{k} = \frac{\pi_i^k\, p(y_t \mid \Theta_i^k)}{\sum_{l=1}^{m} \pi_l^k\, p(y_t \mid \Theta_l^k)}, \quad \text{for } t = 1, \ldots, N^2, \tag{9.8}$$

where yt is the gray level at location t in the given image, πik is the mixing proportion of Gaussian component i at step k, and Θik is the estimated parameter vector of Gaussian component i at step k.
3. The M-step: compute the new mixing proportion, mean, and variance from the following equations:

$$\pi_i^{k+1} = \frac{1}{N^2}\sum_{t=1}^{N^2}\delta_{it}^{k}, \tag{9.9}$$

$$\mu_i^{k+1} = \frac{\sum_{t=1}^{N^2}\delta_{it}^{k}\,y_t}{\sum_{t=1}^{N^2}\delta_{it}^{k}}, \tag{9.10}$$

$$(\sigma_i^{k+1})^2 = \frac{\sum_{t=1}^{N^2}\delta_{it}^{k}\,(y_t-\mu_i^{k})^2}{\sum_{t=1}^{N^2}\delta_{it}^{k}}. \tag{9.11}$$
4. Repeat steps 2 and 3 until the relative differences between subsequent values of Eqs. 9.9, 9.10, and 9.11 are sufficiently small.
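The steps above can be sketched as a minimal NumPy implementation, assuming a grayscale image flattened to N² samples. Quantile-based initialization is used here in place of the random initialization of step 1, the new means are used in the variance update (standard EM practice), and the stopping test of step 4 uses the log-likelihood; function and parameter names are illustrative, not from the book:

```python
import numpy as np

def em_gaussian_mixture(y, m, n_iter=100, tol=1e-6):
    """EM for a 1-D Gaussian mixture with m components (steps 1-4 above)."""
    y = np.asarray(y, dtype=float).ravel()   # flatten the N x N image
    n = y.size
    # Step 1: mixing proportions uniform, means at data quantiles
    # (a deterministic stand-in for random initialization), variances
    # at the overall variance.
    pi = np.full(m, 1.0 / m)
    mu = np.quantile(y, (np.arange(m) + 0.5) / m)
    var = np.full(m, y.var() + 1e-6)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # Step 2 (E-step): responsibilities delta[i, t], Eq. 9.8.
        diff = y[None, :] - mu[:, None]
        dens = np.exp(-0.5 * diff**2 / var[:, None]) / np.sqrt(2.0 * np.pi * var[:, None])
        num = pi[:, None] * dens
        delta = num / num.sum(axis=0, keepdims=True)
        # Step 3 (M-step): Eqs. 9.9-9.11.
        nk = delta.sum(axis=1)
        pi = nk / n
        mu = (delta @ y) / nk
        var = (delta * (y[None, :] - mu[:, None]) ** 2).sum(axis=1) / nk + 1e-12
        # Step 4: stop when the relative change is sufficiently small.
        ll = np.log(num.sum(axis=0)).sum()
        if np.isfinite(prev_ll) and abs(ll - prev_ll) < tol * abs(prev_ll):
            break
        prev_ll = ll
    return pi, mu, var
```

On a well-separated bimodal histogram the returned means land near the two class modes within a handful of iterations.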
Let pI1(y), pI2(y), . . . , pIm(y) be the dominant Gaussian components estimated by the above algorithm. Then the initial estimated density pI(y) for the given image can be defined as follows:

$$p_I(y) = \pi_1 p_{I1}(y) + \pi_2 p_{I2}(y) + \cdots + \pi_m p_{Im}(y). \tag{9.12}$$
Because the empirical data do not exactly follow a mixture of normal distributions, there will be an error between pI(y) and pem(y). So we suggest the following
6. Repeat steps 2, 3, 4, and 5, increasing the number of Gaussian components n by 1, as long as the conditional expectation Q(n) is still increasing and (n) is still decreasing; otherwise stop and select the parameters that correspond to the maximum Q(n) and the minimum (n).
Since the EM algorithm can be trapped in a local optimum, we run the above algorithm several times and select the number of Gaussian components and the parameters that give the maximum Q(n) and the minimum (n).
After we have determined the number of Gaussian components that form |ζ(y)|, we need to determine which components belong to class 1, which to class 2, and so on. In this model we classify these components by minimizing the risk function under a 0–1 loss. To minimize the risk function, we can use the following algorithm. Note that the algorithm below is written for two classes, but it is easy to generalize to n classes.
9.2.5.3 Components Classification Algorithm
1. All Gaussian components whose mean is less than the estimated mean of pI1(y) belong to the first class.
2. All Gaussian components whose mean is greater than the estimated mean of pI2(y) belong to the second class.
3. For the components whose mean is greater than the estimated mean of pI1(y) and less than the estimated mean of pI2(y), do the following:
(a) Assume that the first component belongs to the first class and the other components belong to the second class. Compute the risk value from the following equation:
$$R(Th) = \int_{Th}^{\infty} p(y \mid 1)\,dy + \int_{-\infty}^{Th} p(y \mid 2)\,dy, \tag{9.21}$$
where Th is the threshold that separates class 1 from class 2. The above integration can be done using a second-order spline.
(b) Assume that the first and second components belong to the first class and the other components belong to the second class, and compute R(Th) from Eq. 9.21. Continue this process as long as R(Th) decreases, and stop when R(Th) starts to increase.
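The search in steps (a)–(b) can be sketched as follows. For Gaussian class densities the integrals in Eq. 9.21 reduce to error-function tails, used here in place of the book's second-order spline integration; choosing Th midway between the boundary component means is an assumption, since the text does not fix how Th is selected:

```python
import math

def risk(th, class1, class2):
    """Eq. 9.21 under 0-1 loss: mass of class 1 above Th plus mass of
    class 2 below Th.  Each class is a list of (pi, mu, sigma) Gaussians,
    so each integral is a weighted Gaussian tail probability."""
    def cdf(x, mu, sigma):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    r1 = sum(pi * (1.0 - cdf(th, mu, s)) for pi, mu, s in class1)  # class 1 above Th
    r2 = sum(pi * cdf(th, mu, s) for pi, mu, s in class2)          # class 2 below Th
    return r1 + r2

def classify_components(components):
    """Steps (a)-(b): grow the first class one component at a time while
    R(Th) decreases.  `components` are (pi, mu, sigma) triples sorted by
    mean; Th is taken midway between the boundary means (an assumption)."""
    best_split, best_risk = 1, float("inf")
    for split in range(1, len(components)):
        th = 0.5 * (components[split - 1][1] + components[split][1])
        r = risk(th, components[:split], components[split:])
        if r >= best_risk:
            break            # R(Th) started to increase: stop
        best_split, best_risk = split, r
    return components[:best_split], components[best_split:]
```

With three components whose first two means sit well below the third, the search correctly groups the first two into class 1.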
Finally, to show the convergence of the proposed model, we will show experimentally that, when this model is used, the Levy distance between the estimated distribution Pes(y) and the empirical distribution Pem(y) decreases. The Levy distance ρ(Pem, Pes) is defined as

$$\rho(P_{em}, P_{es}) = \inf\{\xi > 0 : \forall y,\; P_{em}(y - \xi) - \xi \le P_{es}(y) \le P_{em}(y + \xi) + \xi\}. \tag{9.22}$$

As ρ(Pem, Pes) approaches zero, Pes(y) converges weakly to Pem(y).
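As a rough illustration (not from the book), the Levy distance of Eq. 9.22 between two CDFs sampled on a common grid can be approximated by a grid search over candidate ξ values:

```python
import numpy as np

def levy_distance(F_em, F_es, grid, xis=None):
    """Approximate Levy distance (Eq. 9.22) between two CDFs sampled on
    `grid`: the smallest xi such that
    F_em(y - xi) - xi <= F_es(y) <= F_em(y + xi) + xi for all grid points y.
    A simple linear search over the candidate xi values."""
    if xis is None:
        xis = np.linspace(0.0, 1.0, 1001)
    for xi in xis:
        # Shifted copies of F_em, extended by 0 on the left and 1 on the right.
        lower = np.interp(grid - xi, grid, F_em, left=0.0, right=1.0) - xi
        upper = np.interp(grid + xi, grid, F_em, left=0.0, right=1.0) + xi
        if np.all(lower <= F_es) and np.all(F_es <= upper):
            return xi
    return xis[-1]
```

Identical CDFs give a distance of zero, and a horizontally shifted CDF gives a small positive distance, matching the intuition that ρ → 0 as the estimate approaches the empirical distribution.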
9.2.6 Parameter Estimation for High-Level Process
In order to carry out the MAP parameter estimation in Eq. 9.5, one needs to specify the parameters of the high-level process. A popular model for the high-level process is the Gibbs Markov model, which follows Eq. 9.2. To estimate the parameters of the GMRF, we find the parameters that maximize Eq. 9.2, using the Metropolis algorithm and a genetic algorithm (GA).
The Metropolis algorithm is a relaxation algorithm for finding a global maximum. The algorithm assumes that the classes of all neighbors of y are known. The high-level process is assumed to be formed of m independent processes; each of the m processes is modeled by a Gibbs Markov random field that follows Eq. 9.2. Then y can be classified using the fact that p(xt | y) is proportional to p(y | xt) p(xt | ηs), where ηs is the neighbor set of site s belonging to class xt, p(xt | ηs) is computed from Eq. 9.2, and p(y | xt) is computed from the estimated density for each class.
Using the Bayes classifier, we obtain an initial labeled image. In order to run the Metropolis algorithm, we must first know the coefficients of the potential function E(x), so we use a GA to estimate the coefficients of E(x) and evaluate them through the fitness function.
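A simplified sketch of one Metropolis relaxation sweep, assuming two classes, Gaussian class densities, and an Ising-style first-order potential with a single coefficient beta (the simplest special case of E(x); all names are illustrative, not the book's implementation):

```python
import math
import random

def metropolis_sweep(labels, image, class_params, beta, T=1.0, seed=0):
    """One Metropolis sweep over a 2-D label field.  `class_params` maps a
    label to its (mu, sigma); `beta` penalizes disagreement with the
    4-neighborhood (first-order GMRF special case)."""
    rng = random.Random(seed)
    H, W = len(labels), len(labels[0])

    def local_energy(r, c, lab):
        mu, sigma = class_params[lab]
        y = image[r][c]
        # -log p(y | class) for a Gaussian likelihood (constants dropped)
        e = 0.5 * ((y - mu) / sigma) ** 2 + math.log(sigma)
        # Gibbs prior: one beta per disagreeing 4-neighbor
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < H and 0 <= cc < W and labels[rr][cc] != lab:
                e += beta
        return e

    for r in range(H):
        for c in range(W):
            cur = labels[r][c]
            prop = 1 - cur                      # flip between the two classes
            dE = local_energy(r, c, prop) - local_energy(r, c, cur)
            # Accept the flip if it lowers the energy, or with
            # probability exp(-dE / T) otherwise.
            if dE < 0 or rng.random() < math.exp(-dE / T):
                labels[r][c] = prop
    return labels
```

On a noiseless two-region image with well-separated class means, a single sweep already drives the labels to the correct segmentation.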
9.2.6.1 Maximization Using Genetic Algorithm
To build the genetic algorithm, we define the following parameters:
Chromosome: A chromosome is represented in binary digits and consists of representations of the model order and the clique coefficients. Each chromosome has 41 bits. The first bit represents the order of the system (we use the digit “0” for a first-order and the digit “1” for a second-order GMRF). The remaining bits represent the
clique coefficients, where each clique coefficient is represented by 4 bits (note that for a first-order system we estimate only five parameters and the remaining clique coefficients are set to zero, while for a second-order system we estimate ten parameters).
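The encoding above can be sketched as follows; note that the book does not specify how each 4-bit group maps to a real coefficient, so the linear mapping of the unsigned integer 0..15 onto [-1, 1] below is purely an assumption:

```python
def decode_chromosome(bits):
    """Decode a 41-bit chromosome: bit 0 selects the GMRF order
    (0 -> first order, 5 coefficients; 1 -> second order, 10 coefficients),
    and each subsequent group of 4 bits encodes one clique coefficient.
    The 4-bit-to-real mapping is an assumed linear map onto [-1, 1]."""
    assert len(bits) == 41
    order = 2 if bits[0] == 1 else 1
    n_coeff = 10 if order == 2 else 5
    coeffs = []
    for i in range(10):
        group = bits[1 + 4 * i : 5 + 4 * i]
        value = int("".join(str(b) for b in group), 2)   # unsigned 0..15
        coeffs.append(2.0 * value / 15.0 - 1.0)           # map to [-1, 1]
    # For a first-order system the unused coefficients are forced to zero.
    for i in range(n_coeff, 10):
        coeffs[i] = 0.0
    return order, coeffs
```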
Fitness Function: Since our goal is to select the high-level process X that maximizes Eq. 9.5, we can use Eq. 9.5 as the fitness function.
High-level parameter estimation algorithm:
1. Generate the first generation, which consists of 30 chromosomes.
2. Apply the Metropolis algorithm for each chromosome on each image and then compute the fitness function as shown in Eq. 9.5.
3. If the fitness values of all chromosomes do not change from one population to the next, stop and select the chromosome that gives the maximum fitness value. (If two chromosomes give the same fitness value, we select the one that represents the lower order system.) Otherwise go to step 2.
Using the results obtained by this algorithm, we repeat the estimation of the low-level and high-level processes, stopping when the difference between the current and previous parameters is small.
9.3 Applications
Lung cancer remains the leading cause of cancer mortality. In 1999, there were approximately 170,000 new cases of lung cancer [21]. The 5-year survival rate from the disease is 14% and has increased only slightly since the early 1970s despite extensive and expensive research to find effective therapy. The disparity in survival between early- and late-stage lung cancer is substantial, with a 5-year survival rate of approximately 70% in stage IA disease compared to less than 5% in stage IV disease, according to the recently revised lung cancer staging criteria [21]. The disproportionately high prevalence of, and mortality from, lung cancer has encouraged attempts to detect early lung cancer with screening programs aimed at smokers. Smokers have an incidence rate of lung
cancer that is ten times that of nonsmokers and account for more than 80% of lung cancer cases in the United States [21].
One in every 18 women and one in every 12 men develops lung cancer, making it the leading cause of cancer deaths. Early detection of lung tumors (visible on the chest film as nodules) may increase the patient's chance of survival. For this reason the Jewish Hospital designed an early-detection program. A number of lung cancer screening trials have been conducted in the United States, Japan, and Europe for the purpose of developing an automatic approach to tumor detection [21].
At the University of Louisville CVIP Lab, a long-term effort has been under way to develop a comprehensive image analysis system to detect and recognize lung nodules in low-dose chest CT (LDCT) scans. The LDCT scanning was performed with the following parameters: slice thickness of 8 mm reconstructed every 4 mm and a scanning pitch of 1.5. In the following sections we highlight our approach to the automatic detection and recognition of lung nodules; further details can be found in [22].
9.3.1 Lung Extraction
The goal of lung extraction is to separate the voxels corresponding to lung tissue from those belonging to the surrounding anatomical structures. We assume that each slice consists of two types of pixels: lung and other tissues (e.g., chest, ribs, and liver). The difficulty in lung segmentation is that some tissues inside the lung, such as arteries, veins, bronchi, and bronchioles, have gray levels close to those of the chest. Therefore, if we depend only on the gray level in this application, we lose some of the lung tissue during segmentation. Our proposed model, which estimates parameters for two processes (a high-level process and a low-level process), is suitable for this application because it depends not only on the gray level but also takes into account the spatial clustering of pixels into regions.
We will apply the approach described in Section 9.2.4 to lung CT. Figure 9.4 shows a typical CT slice of the chest. We assume that each slice consists of two types of tissues: lung and other tissues (e.g., chest, ribs, and liver). As discussed above, we need to estimate the parameters of both the low-level and the high-level processes. Table 9.1 presents the results of applying the